The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 2e2832562c877e6530b8480982d99a4ff90c6777 Mon Sep 17 00:00:00 2001
From: Alan Young <consult.awy(a)gmail.com>
Date: Fri, 9 Jul 2021 09:48:54 +0100
Subject: [PATCH] ALSA: pcm: Call substream ack() method upon compat mmap
commit
If a 32-bit application is being used with a 64-bit kernel and is using
the mmap mechanism to write data, then the SNDRV_PCM_IOCTL_SYNC_PTR
ioctl results in calling snd_pcm_ioctl_sync_ptr_compat(). Make this use
pcm_lib_apply_appl_ptr() so that the substream's ack() method, if
defined, is called.
The snd_pcm_sync_ptr() function, used in the 64-bit ioctl case, already
uses snd_pcm_ioctl_sync_ptr_compat().
Fixes: 9027c4639ef1 ("ALSA: pcm: Call ack() whenever appl_ptr is updated")
Signed-off-by: Alan Young <consult.awy(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/c441f18c-eb2a-3bdd-299a-696ccca2de9c@gmail.com
Signed-off-by: Takashi Iwai <tiwai(a)suse.de>
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 14e32825c339..c88c4316c417 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3063,9 +3063,14 @@ static int snd_pcm_ioctl_sync_ptr_compat(struct snd_pcm_substream *substream,
boundary = 0x7fffffff;
snd_pcm_stream_lock_irq(substream);
/* FIXME: we should consider the boundary for the sync from app */
- if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL))
- control->appl_ptr = scontrol.appl_ptr;
- else
+ if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL)) {
+ err = pcm_lib_apply_appl_ptr(substream,
+ scontrol.appl_ptr);
+ if (err < 0) {
+ snd_pcm_stream_unlock_irq(substream);
+ return err;
+ }
+ } else
scontrol.appl_ptr = control->appl_ptr % boundary;
if (!(sflags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN))
control->avail_min = scontrol.avail_min;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From edb25572fc7058db5a98223e11d2d50497178553 Mon Sep 17 00:00:00 2001
From: Stephen Boyd <swboyd(a)chromium.org>
Date: Wed, 23 Jun 2021 00:50:01 -0700
Subject: [PATCH] mmc: core: Use kref in place of struct mmc_blk_data::usage
Ulf reported the following KASAN splat after adding some manual hacks
into mmc-utils[1].
DEBUG: mmc_blk_open: Let's sleep for 10s..
mmc1: card 0007 removed
BUG: KASAN: use-after-free in mmc_blk_get+0x58/0xb8
Read of size 4 at addr ffff00000a394a28 by task mmc/180
CPU: 2 PID: 180 Comm: mmc Not tainted 5.10.0-rc4-00069-gcc758c8c7127-dirty #5
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
dump_backtrace+0x0/0x2b4
show_stack+0x18/0x6c
dump_stack+0xfc/0x168
print_address_description.constprop.0+0x6c/0x488
kasan_report+0x118/0x210
__asan_load4+0x94/0xd0
mmc_blk_get+0x58/0xb8
mmc_blk_open+0x7c/0xdc
__blkdev_get+0x3b4/0x964
blkdev_get+0x64/0x100
blkdev_open+0xe8/0x104
do_dentry_open+0x234/0x61c
vfs_open+0x54/0x64
path_openat+0xe04/0x1584
do_filp_open+0xe8/0x1e4
do_sys_openat2+0x120/0x230
__arm64_sys_openat+0xf0/0x15c
el0_svc_common.constprop.0+0xac/0x234
do_el0_svc+0x84/0xa0
el0_sync_handler+0x264/0x270
el0_sync+0x174/0x180
Allocated by task 33:
stack_trace_save+0x9c/0xdc
kasan_save_stack+0x28/0x60
__kasan_kmalloc.constprop.0+0xc8/0xf0
kasan_kmalloc+0x10/0x20
mmc_blk_alloc_req+0x94/0x4b0
mmc_blk_probe+0x2d4/0xaa4
mmc_bus_probe+0x34/0x4c
really_probe+0x148/0x6e0
driver_probe_device+0x78/0xec
__device_attach_driver+0x108/0x16c
bus_for_each_drv+0xf4/0x15c
__device_attach+0x168/0x240
device_initial_probe+0x14/0x20
bus_probe_device+0xec/0x100
device_add+0x55c/0xaf0
mmc_add_card+0x288/0x380
mmc_attach_sd+0x18c/0x22c
mmc_rescan+0x444/0x4f0
process_one_work+0x3b8/0x650
worker_thread+0xa0/0x724
kthread+0x218/0x220
ret_from_fork+0x10/0x38
Freed by task 33:
stack_trace_save+0x9c/0xdc
kasan_save_stack+0x28/0x60
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x180
kasan_slab_free+0x14/0x20
kfree+0xb8/0x46c
mmc_blk_put+0xe4/0x11c
mmc_blk_remove_req.part.0+0x6c/0xe4
mmc_blk_remove+0x368/0x370
mmc_bus_remove+0x34/0x50
__device_release_driver+0x228/0x31c
device_release_driver+0x2c/0x44
bus_remove_device+0x1e4/0x200
device_del+0x2b0/0x770
mmc_remove_card+0xf0/0x150
mmc_sd_detect+0x9c/0x150
mmc_rescan+0x110/0x4f0
process_one_work+0x3b8/0x650
worker_thread+0xa0/0x724
kthread+0x218/0x220
ret_from_fork+0x10/0x38
The buggy address belongs to the object at ffff00000a394800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 552 bytes inside of
1024-byte region [ffff00000a394800, ffff00000a394c00)
The buggy address belongs to the page:
page:00000000ff84ed53 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8a390
head:00000000ff84ed53 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x3fffc0000010200(slab|head)
raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff000009f03800
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff00000a394900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff00000a394980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff00000a394a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff00000a394a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff00000a394b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Looking closer at the problem, it looks like a classic dangling pointer
bug. The 'struct mmc_blk_data' that is used after being freed in
mmc_blk_put() is stashed away in 'md->disk->private_data' via
mmc_blk_alloc_req() but used in mmc_blk_get() because the 'usage' count
isn't properly aligned with the lifetime of the pointer. You'd expect
the 'usage' member to be in sync with the kfree(), and it mostly is,
except that mmc_blk_get() needs to dereference the potentially freed
memory storage for the 'struct mmc_blk_data' stashed away in the
private_data member to look at 'usage' before it actually figures out if
it wants to consider it a valid pointer or not. That's not going to work
if the freed memory has been overwritten by something else after the
free, and KASAN rightly complains here.
To fix the immediate problem, let's set the private_data member to NULL
in mmc_blk_put() so that mmc_blk_get() can consider the object "on the
way out" if the pointer is NULL and not even try to look at 'usage' if
the object isn't going to be around much longer. With that set to NULL
on the last mmc_blk_put(), optimize the get path further and use a kref
underneath the 'open_lock' mutex to only up the reference count if it's
non-zero, i.e. alive, and otherwise make mmc_blk_get() return NULL,
without actually testing the reference count if we're in the process of
removing the object from the system.
Finally, tighten the locking region on the put side to only be around
the parts that are removing the 'mmc_blk_data' from the system and
publishing that fact to the gendisk and then drop the lock as soon as we
can to avoid holding the lock around code that doesn't need it. This
fixes the KASAN issue.
Cc: Matthias Schiffer <matthias.schiffer(a)ew.tq-group.com>
Cc: Sujit Kautkar <sujitka(a)chromium.org>
Cc: Zubin Mithra <zsm(a)chromium.org>
Reported-by: Ulf Hansson <ulf.hansson(a)linaro.org>
Link: https://lore.kernel.org/linux-mmc/CAPDyKFryT63Jc7+DXWSpAC19qpZRqFr1orxwYGMu… [1]
Signed-off-by: Stephen Boyd <swboyd(a)chromium.org>
Link: https://lore.kernel.org/r/20210623075002.1746924-2-swboyd@chromium.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 9890a1532cb0..ce8aed562929 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -28,6 +28,7 @@
#include <linux/errno.h>
#include <linux/hdreg.h>
#include <linux/kdev_t.h>
+#include <linux/kref.h>
#include <linux/blkdev.h>
#include <linux/cdev.h>
#include <linux/mutex.h>
@@ -111,7 +112,7 @@ struct mmc_blk_data {
#define MMC_BLK_CMD23 (1 << 0) /* Can do SET_BLOCK_COUNT for multiblock */
#define MMC_BLK_REL_WR (1 << 1) /* MMC Reliable write support */
- unsigned int usage;
+ struct kref kref;
unsigned int read_only;
unsigned int part_type;
unsigned int reset_done;
@@ -181,10 +182,8 @@ static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
mutex_lock(&open_lock);
md = disk->private_data;
- if (md && md->usage == 0)
+ if (md && !kref_get_unless_zero(&md->kref))
md = NULL;
- if (md)
- md->usage++;
mutex_unlock(&open_lock);
return md;
@@ -196,18 +195,25 @@ static inline int mmc_get_devidx(struct gendisk *disk)
return devidx;
}
-static void mmc_blk_put(struct mmc_blk_data *md)
+static void mmc_blk_kref_release(struct kref *ref)
{
- mutex_lock(&open_lock);
- md->usage--;
- if (md->usage == 0) {
- int devidx = mmc_get_devidx(md->disk);
+ struct mmc_blk_data *md = container_of(ref, struct mmc_blk_data, kref);
+ int devidx;
- ida_simple_remove(&mmc_blk_ida, devidx);
- put_disk(md->disk);
- kfree(md);
- }
+ devidx = mmc_get_devidx(md->disk);
+ ida_simple_remove(&mmc_blk_ida, devidx);
+
+ mutex_lock(&open_lock);
+ md->disk->private_data = NULL;
mutex_unlock(&open_lock);
+
+ put_disk(md->disk);
+ kfree(md);
+}
+
+static void mmc_blk_put(struct mmc_blk_data *md)
+{
+ kref_put(&md->kref, mmc_blk_kref_release);
}
static ssize_t power_ro_lock_show(struct device *dev,
@@ -2327,7 +2333,8 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
INIT_LIST_HEAD(&md->part);
INIT_LIST_HEAD(&md->rpmbs);
- md->usage = 1;
+ kref_init(&md->kref);
+
md->queue.blkdata = md;
md->disk->major = MMC_BLOCK_MAJOR;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From edb25572fc7058db5a98223e11d2d50497178553 Mon Sep 17 00:00:00 2001
From: Stephen Boyd <swboyd(a)chromium.org>
Date: Wed, 23 Jun 2021 00:50:01 -0700
Subject: [PATCH] mmc: core: Use kref in place of struct mmc_blk_data::usage
Ulf reported the following KASAN splat after adding some manual hacks
into mmc-utils[1].
DEBUG: mmc_blk_open: Let's sleep for 10s..
mmc1: card 0007 removed
BUG: KASAN: use-after-free in mmc_blk_get+0x58/0xb8
Read of size 4 at addr ffff00000a394a28 by task mmc/180
CPU: 2 PID: 180 Comm: mmc Not tainted 5.10.0-rc4-00069-gcc758c8c7127-dirty #5
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
dump_backtrace+0x0/0x2b4
show_stack+0x18/0x6c
dump_stack+0xfc/0x168
print_address_description.constprop.0+0x6c/0x488
kasan_report+0x118/0x210
__asan_load4+0x94/0xd0
mmc_blk_get+0x58/0xb8
mmc_blk_open+0x7c/0xdc
__blkdev_get+0x3b4/0x964
blkdev_get+0x64/0x100
blkdev_open+0xe8/0x104
do_dentry_open+0x234/0x61c
vfs_open+0x54/0x64
path_openat+0xe04/0x1584
do_filp_open+0xe8/0x1e4
do_sys_openat2+0x120/0x230
__arm64_sys_openat+0xf0/0x15c
el0_svc_common.constprop.0+0xac/0x234
do_el0_svc+0x84/0xa0
el0_sync_handler+0x264/0x270
el0_sync+0x174/0x180
Allocated by task 33:
stack_trace_save+0x9c/0xdc
kasan_save_stack+0x28/0x60
__kasan_kmalloc.constprop.0+0xc8/0xf0
kasan_kmalloc+0x10/0x20
mmc_blk_alloc_req+0x94/0x4b0
mmc_blk_probe+0x2d4/0xaa4
mmc_bus_probe+0x34/0x4c
really_probe+0x148/0x6e0
driver_probe_device+0x78/0xec
__device_attach_driver+0x108/0x16c
bus_for_each_drv+0xf4/0x15c
__device_attach+0x168/0x240
device_initial_probe+0x14/0x20
bus_probe_device+0xec/0x100
device_add+0x55c/0xaf0
mmc_add_card+0x288/0x380
mmc_attach_sd+0x18c/0x22c
mmc_rescan+0x444/0x4f0
process_one_work+0x3b8/0x650
worker_thread+0xa0/0x724
kthread+0x218/0x220
ret_from_fork+0x10/0x38
Freed by task 33:
stack_trace_save+0x9c/0xdc
kasan_save_stack+0x28/0x60
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x180
kasan_slab_free+0x14/0x20
kfree+0xb8/0x46c
mmc_blk_put+0xe4/0x11c
mmc_blk_remove_req.part.0+0x6c/0xe4
mmc_blk_remove+0x368/0x370
mmc_bus_remove+0x34/0x50
__device_release_driver+0x228/0x31c
device_release_driver+0x2c/0x44
bus_remove_device+0x1e4/0x200
device_del+0x2b0/0x770
mmc_remove_card+0xf0/0x150
mmc_sd_detect+0x9c/0x150
mmc_rescan+0x110/0x4f0
process_one_work+0x3b8/0x650
worker_thread+0xa0/0x724
kthread+0x218/0x220
ret_from_fork+0x10/0x38
The buggy address belongs to the object at ffff00000a394800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 552 bytes inside of
1024-byte region [ffff00000a394800, ffff00000a394c00)
The buggy address belongs to the page:
page:00000000ff84ed53 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8a390
head:00000000ff84ed53 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x3fffc0000010200(slab|head)
raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff000009f03800
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff00000a394900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff00000a394980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff00000a394a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff00000a394a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff00000a394b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Looking closer at the problem, it looks like a classic dangling pointer
bug. The 'struct mmc_blk_data' that is used after being freed in
mmc_blk_put() is stashed away in 'md->disk->private_data' via
mmc_blk_alloc_req() but used in mmc_blk_get() because the 'usage' count
isn't properly aligned with the lifetime of the pointer. You'd expect
the 'usage' member to be in sync with the kfree(), and it mostly is,
except that mmc_blk_get() needs to dereference the potentially freed
memory storage for the 'struct mmc_blk_data' stashed away in the
private_data member to look at 'usage' before it actually figures out if
it wants to consider it a valid pointer or not. That's not going to work
if the freed memory has been overwritten by something else after the
free, and KASAN rightly complains here.
To fix the immediate problem, let's set the private_data member to NULL
in mmc_blk_put() so that mmc_blk_get() can consider the object "on the
way out" if the pointer is NULL and not even try to look at 'usage' if
the object isn't going to be around much longer. With that set to NULL
on the last mmc_blk_put(), optimize the get path further and use a kref
underneath the 'open_lock' mutex to only up the reference count if it's
non-zero, i.e. alive, and otherwise make mmc_blk_get() return NULL,
without actually testing the reference count if we're in the process of
removing the object from the system.
Finally, tighten the locking region on the put side to only be around
the parts that are removing the 'mmc_blk_data' from the system and
publishing that fact to the gendisk and then drop the lock as soon as we
can to avoid holding the lock around code that doesn't need it. This
fixes the KASAN issue.
Cc: Matthias Schiffer <matthias.schiffer(a)ew.tq-group.com>
Cc: Sujit Kautkar <sujitka(a)chromium.org>
Cc: Zubin Mithra <zsm(a)chromium.org>
Reported-by: Ulf Hansson <ulf.hansson(a)linaro.org>
Link: https://lore.kernel.org/linux-mmc/CAPDyKFryT63Jc7+DXWSpAC19qpZRqFr1orxwYGMu… [1]
Signed-off-by: Stephen Boyd <swboyd(a)chromium.org>
Link: https://lore.kernel.org/r/20210623075002.1746924-2-swboyd@chromium.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 9890a1532cb0..ce8aed562929 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -28,6 +28,7 @@
#include <linux/errno.h>
#include <linux/hdreg.h>
#include <linux/kdev_t.h>
+#include <linux/kref.h>
#include <linux/blkdev.h>
#include <linux/cdev.h>
#include <linux/mutex.h>
@@ -111,7 +112,7 @@ struct mmc_blk_data {
#define MMC_BLK_CMD23 (1 << 0) /* Can do SET_BLOCK_COUNT for multiblock */
#define MMC_BLK_REL_WR (1 << 1) /* MMC Reliable write support */
- unsigned int usage;
+ struct kref kref;
unsigned int read_only;
unsigned int part_type;
unsigned int reset_done;
@@ -181,10 +182,8 @@ static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
mutex_lock(&open_lock);
md = disk->private_data;
- if (md && md->usage == 0)
+ if (md && !kref_get_unless_zero(&md->kref))
md = NULL;
- if (md)
- md->usage++;
mutex_unlock(&open_lock);
return md;
@@ -196,18 +195,25 @@ static inline int mmc_get_devidx(struct gendisk *disk)
return devidx;
}
-static void mmc_blk_put(struct mmc_blk_data *md)
+static void mmc_blk_kref_release(struct kref *ref)
{
- mutex_lock(&open_lock);
- md->usage--;
- if (md->usage == 0) {
- int devidx = mmc_get_devidx(md->disk);
+ struct mmc_blk_data *md = container_of(ref, struct mmc_blk_data, kref);
+ int devidx;
- ida_simple_remove(&mmc_blk_ida, devidx);
- put_disk(md->disk);
- kfree(md);
- }
+ devidx = mmc_get_devidx(md->disk);
+ ida_simple_remove(&mmc_blk_ida, devidx);
+
+ mutex_lock(&open_lock);
+ md->disk->private_data = NULL;
mutex_unlock(&open_lock);
+
+ put_disk(md->disk);
+ kfree(md);
+}
+
+static void mmc_blk_put(struct mmc_blk_data *md)
+{
+ kref_put(&md->kref, mmc_blk_kref_release);
}
static ssize_t power_ro_lock_show(struct device *dev,
@@ -2327,7 +2333,8 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
INIT_LIST_HEAD(&md->part);
INIT_LIST_HEAD(&md->rpmbs);
- md->usage = 1;
+ kref_init(&md->kref);
+
md->queue.blkdata = md;
md->disk->major = MMC_BLOCK_MAJOR;
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From edb25572fc7058db5a98223e11d2d50497178553 Mon Sep 17 00:00:00 2001
From: Stephen Boyd <swboyd(a)chromium.org>
Date: Wed, 23 Jun 2021 00:50:01 -0700
Subject: [PATCH] mmc: core: Use kref in place of struct mmc_blk_data::usage
Ulf reported the following KASAN splat after adding some manual hacks
into mmc-utils[1].
DEBUG: mmc_blk_open: Let's sleep for 10s..
mmc1: card 0007 removed
BUG: KASAN: use-after-free in mmc_blk_get+0x58/0xb8
Read of size 4 at addr ffff00000a394a28 by task mmc/180
CPU: 2 PID: 180 Comm: mmc Not tainted 5.10.0-rc4-00069-gcc758c8c7127-dirty #5
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Call trace:
dump_backtrace+0x0/0x2b4
show_stack+0x18/0x6c
dump_stack+0xfc/0x168
print_address_description.constprop.0+0x6c/0x488
kasan_report+0x118/0x210
__asan_load4+0x94/0xd0
mmc_blk_get+0x58/0xb8
mmc_blk_open+0x7c/0xdc
__blkdev_get+0x3b4/0x964
blkdev_get+0x64/0x100
blkdev_open+0xe8/0x104
do_dentry_open+0x234/0x61c
vfs_open+0x54/0x64
path_openat+0xe04/0x1584
do_filp_open+0xe8/0x1e4
do_sys_openat2+0x120/0x230
__arm64_sys_openat+0xf0/0x15c
el0_svc_common.constprop.0+0xac/0x234
do_el0_svc+0x84/0xa0
el0_sync_handler+0x264/0x270
el0_sync+0x174/0x180
Allocated by task 33:
stack_trace_save+0x9c/0xdc
kasan_save_stack+0x28/0x60
__kasan_kmalloc.constprop.0+0xc8/0xf0
kasan_kmalloc+0x10/0x20
mmc_blk_alloc_req+0x94/0x4b0
mmc_blk_probe+0x2d4/0xaa4
mmc_bus_probe+0x34/0x4c
really_probe+0x148/0x6e0
driver_probe_device+0x78/0xec
__device_attach_driver+0x108/0x16c
bus_for_each_drv+0xf4/0x15c
__device_attach+0x168/0x240
device_initial_probe+0x14/0x20
bus_probe_device+0xec/0x100
device_add+0x55c/0xaf0
mmc_add_card+0x288/0x380
mmc_attach_sd+0x18c/0x22c
mmc_rescan+0x444/0x4f0
process_one_work+0x3b8/0x650
worker_thread+0xa0/0x724
kthread+0x218/0x220
ret_from_fork+0x10/0x38
Freed by task 33:
stack_trace_save+0x9c/0xdc
kasan_save_stack+0x28/0x60
kasan_set_track+0x28/0x40
kasan_set_free_info+0x24/0x4c
__kasan_slab_free+0x100/0x180
kasan_slab_free+0x14/0x20
kfree+0xb8/0x46c
mmc_blk_put+0xe4/0x11c
mmc_blk_remove_req.part.0+0x6c/0xe4
mmc_blk_remove+0x368/0x370
mmc_bus_remove+0x34/0x50
__device_release_driver+0x228/0x31c
device_release_driver+0x2c/0x44
bus_remove_device+0x1e4/0x200
device_del+0x2b0/0x770
mmc_remove_card+0xf0/0x150
mmc_sd_detect+0x9c/0x150
mmc_rescan+0x110/0x4f0
process_one_work+0x3b8/0x650
worker_thread+0xa0/0x724
kthread+0x218/0x220
ret_from_fork+0x10/0x38
The buggy address belongs to the object at ffff00000a394800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 552 bytes inside of
1024-byte region [ffff00000a394800, ffff00000a394c00)
The buggy address belongs to the page:
page:00000000ff84ed53 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8a390
head:00000000ff84ed53 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x3fffc0000010200(slab|head)
raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff000009f03800
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff00000a394900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff00000a394980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff00000a394a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff00000a394a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff00000a394b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Looking closer at the problem, it looks like a classic dangling pointer
bug. The 'struct mmc_blk_data' that is used after being freed in
mmc_blk_put() is stashed away in 'md->disk->private_data' via
mmc_blk_alloc_req() but used in mmc_blk_get() because the 'usage' count
isn't properly aligned with the lifetime of the pointer. You'd expect
the 'usage' member to be in sync with the kfree(), and it mostly is,
except that mmc_blk_get() needs to dereference the potentially freed
memory storage for the 'struct mmc_blk_data' stashed away in the
private_data member to look at 'usage' before it actually figures out if
it wants to consider it a valid pointer or not. That's not going to work
if the freed memory has been overwritten by something else after the
free, and KASAN rightly complains here.
To fix the immediate problem, let's set the private_data member to NULL
in mmc_blk_put() so that mmc_blk_get() can consider the object "on the
way out" if the pointer is NULL and not even try to look at 'usage' if
the object isn't going to be around much longer. With that set to NULL
on the last mmc_blk_put(), optimize the get path further and use a kref
underneath the 'open_lock' mutex to only up the reference count if it's
non-zero, i.e. alive, and otherwise make mmc_blk_get() return NULL,
without actually testing the reference count if we're in the process of
removing the object from the system.
Finally, tighten the locking region on the put side to only be around
the parts that are removing the 'mmc_blk_data' from the system and
publishing that fact to the gendisk and then drop the lock as soon as we
can to avoid holding the lock around code that doesn't need it. This
fixes the KASAN issue.
Cc: Matthias Schiffer <matthias.schiffer(a)ew.tq-group.com>
Cc: Sujit Kautkar <sujitka(a)chromium.org>
Cc: Zubin Mithra <zsm(a)chromium.org>
Reported-by: Ulf Hansson <ulf.hansson(a)linaro.org>
Link: https://lore.kernel.org/linux-mmc/CAPDyKFryT63Jc7+DXWSpAC19qpZRqFr1orxwYGMu… [1]
Signed-off-by: Stephen Boyd <swboyd(a)chromium.org>
Link: https://lore.kernel.org/r/20210623075002.1746924-2-swboyd@chromium.org
Cc: stable(a)vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 9890a1532cb0..ce8aed562929 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -28,6 +28,7 @@
#include <linux/errno.h>
#include <linux/hdreg.h>
#include <linux/kdev_t.h>
+#include <linux/kref.h>
#include <linux/blkdev.h>
#include <linux/cdev.h>
#include <linux/mutex.h>
@@ -111,7 +112,7 @@ struct mmc_blk_data {
#define MMC_BLK_CMD23 (1 << 0) /* Can do SET_BLOCK_COUNT for multiblock */
#define MMC_BLK_REL_WR (1 << 1) /* MMC Reliable write support */
- unsigned int usage;
+ struct kref kref;
unsigned int read_only;
unsigned int part_type;
unsigned int reset_done;
@@ -181,10 +182,8 @@ static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
mutex_lock(&open_lock);
md = disk->private_data;
- if (md && md->usage == 0)
+ if (md && !kref_get_unless_zero(&md->kref))
md = NULL;
- if (md)
- md->usage++;
mutex_unlock(&open_lock);
return md;
@@ -196,18 +195,25 @@ static inline int mmc_get_devidx(struct gendisk *disk)
return devidx;
}
-static void mmc_blk_put(struct mmc_blk_data *md)
+static void mmc_blk_kref_release(struct kref *ref)
{
- mutex_lock(&open_lock);
- md->usage--;
- if (md->usage == 0) {
- int devidx = mmc_get_devidx(md->disk);
+ struct mmc_blk_data *md = container_of(ref, struct mmc_blk_data, kref);
+ int devidx;
- ida_simple_remove(&mmc_blk_ida, devidx);
- put_disk(md->disk);
- kfree(md);
- }
+ devidx = mmc_get_devidx(md->disk);
+ ida_simple_remove(&mmc_blk_ida, devidx);
+
+ mutex_lock(&open_lock);
+ md->disk->private_data = NULL;
mutex_unlock(&open_lock);
+
+ put_disk(md->disk);
+ kfree(md);
+}
+
+static void mmc_blk_put(struct mmc_blk_data *md)
+{
+ kref_put(&md->kref, mmc_blk_kref_release);
}
static ssize_t power_ro_lock_show(struct device *dev,
@@ -2327,7 +2333,8 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
INIT_LIST_HEAD(&md->part);
INIT_LIST_HEAD(&md->rpmbs);
- md->usage = 1;
+ kref_init(&md->kref);
+
md->queue.blkdata = md;
md->disk->major = MMC_BLOCK_MAJOR;
[ Upstream commit 997135017716 ("io_uring: Fix race condition when sqp thread goes to sleep") ]
If an asynchronous completion happens before the task is preparing
itself to wait and set its state to TASK_INTERRUPTIBLE, the completion
will not wake up the sqp thread.
Signed-off-by: Olivier Langlois <olivier(a)trillion01.com>
---
fs/io_uring.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index eeea6b8c8bee..a7f1cbd7be9a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6876,7 +6876,8 @@ static int io_sq_thread(void *data)
}
prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE);
- if (!test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) {
+ if (!test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) &&
+ !io_run_task_work()) {
list_for_each_entry(ctx, &sqd->ctx_list, sqd_list)
io_ring_set_wakeup_flag(ctx);
--
2.32.0
On Mon, Jul 26, 2021 at 02:04:17AM -0500, Alephпτ 1 wrote:
> I tried sending this as text with the commands in the email but it rejected
> it because of the code.
>
> Hopefully this screenshot works
screenshots can not be seen with text email readers :(
On Mon, Jul 26, 2021 at 5:44 AM Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> ACPI: utils: Fix reference counting in for_each_acpi_dev_match()
>
> to the 5.13-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> acpi-utils-fix-reference-counting-in-for_each_acpi_d.patch
> and it can be found in the queue-5.13 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
This has to be accompanied with
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
> commit 9b33ed3a4be3985e11c6bc81e5789262794164b6
> Author: Andy Shevchenko <andy.shevchenko(a)gmail.com>
> Date: Mon Jul 12 21:21:21 2021 +0300
--
With Best Regards,
Andy Shevchenko
Hi,
This patch is not up to date.
The following revert and commit are the latest.
https://patchwork.kernel.org/project/spi-devel-general/patch/bd30bdb4-07c4-…https://patchwork.kernel.org/project/spi-devel-general/patch/92eea403-9b21-…
Please replace it with the latest patch.
Best Regards,
On 2021/07/26 11:44, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> spi: spi-cadence-quadspi: Fix division by zero warning
>
> to the 5.13-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> spi-spi-cadence-quadspi-fix-division-by-zero-warning.patch
> and it can be found in the queue-5.13 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 8e7f9650f5d883bca7b0239529c1b704673abd38
> Author: Yoshitaka Ikeda <ikeda(a)nskint.co.jp>
> Date: Thu Jul 15 16:21:32 2021 +0000
>
> spi: spi-cadence-quadspi: Fix division by zero warning
>
> [ Upstream commit 55cef88bbf12f3bfbe5c2379a8868a034707e755 ]
>
> Fix below division by zero warning:
> - Added an if statement because buswidth can be zero, resulting in division by zero.
> - The modified code was based on another driver (atmel-quadspi).
>
> [ 0.795337] Division by zero in kernel.
> :
> [ 0.834051] [<807fd40c>] (__div0) from [<804e1acc>] (Ldiv0+0x8/0x10)
> [ 0.839097] [<805f0710>] (cqspi_exec_mem_op) from [<805edb4c>] (spi_mem_exec_op+0x3b0/0x3f8)
>
> Fixes: 7512eaf54190 ("spi: cadence-quadspi: Fix dummy cycle calculation when buswidth > 1")
> Signed-off-by: Yoshitaka Ikeda <ikeda(a)nskint.co.jp>
> Link: https://lore.kernel.org/r/ed989af6-da88-4e0b-9ed8-126db6cad2e4@nskint.co.jp
> Signed-off-by: Mark Brown <broonie(a)kernel.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index 7a00346ff9b9..13d1f0ce618e 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -307,11 +307,13 @@ static unsigned int cqspi_calc_rdreg(struct cqspi_flash_pdata *f_pdata)
>
> static unsigned int cqspi_calc_dummy(const struct spi_mem_op *op, bool dtr)
> {
> - unsigned int dummy_clk;
> + unsigned int dummy_clk = 0;
>
> - dummy_clk = op->dummy.nbytes * (8 / op->dummy.buswidth);
> - if (dtr)
> - dummy_clk /= 2;
> + if (op->dummy.buswidth && op->dummy.nbytes) {
> + dummy_clk = op->dummy.nbytes * (8 / op->dummy.buswidth);
> + if (dtr)
> + dummy_clk /= 2;
> + }
>
> return dummy_clk;
> }
>
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: fix mount mode command line processing
In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing of
the mount mode string was changed from match_octal() to fsparam_u32. This
changed existing behavior as match_octal does not require octal values to
have a '0' prefix, but fsparam_u32 does.
Use fsparam_u32oct which provides the same behavior as match_octal.
Link: https://lkml.kernel.org/r/20210721183326.102716-1-mike.kravetz@oracle.com
Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Dennis Camera <bugs+kernel.org(a)dtnr.ch>
Reviewed-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/hugetlbfs/inode.c~hugetlbfs-fix-mount-mode-command-line-processing
+++ a/fs/hugetlbfs/inode.c
@@ -77,7 +77,7 @@ enum hugetlb_param {
static const struct fs_parameter_spec hugetlb_fs_parameters[] = {
fsparam_u32 ("gid", Opt_gid),
fsparam_string("min_size", Opt_min_size),
- fsparam_u32 ("mode", Opt_mode),
+ fsparam_u32oct("mode", Opt_mode),
fsparam_string("nr_inodes", Opt_nr_inodes),
fsparam_string("pagesize", Opt_pagesize),
fsparam_string("size", Opt_size),
_
From: Stephane Grosjean <s.grosjean(a)peak-system.com>
This patch fixes an incorrect way of reading error counters in messages
received for this purpose from the PCAN-USB interface. These messages
inform about the increase or decrease of the error counters, whose values
are placed in bytes 1 and 2 of the message data (not 0 and 1).
Fixes: ea8b33bde76c ("can: pcan_usb: add support of rxerr/txerr counters")
Link: https://lore.kernel.org/r/20210625130931.27438-4-s.grosjean@peak-system.com
Cc: linux-stable <stable(a)vger.kernel.org>
Signed-off-by: Stephane Grosjean <s.grosjean(a)peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
drivers/net/can/usb/peak_usb/pcan_usb.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb.c b/drivers/net/can/usb/peak_usb/pcan_usb.c
index 1d6f77252f01..899a3d21b77f 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb.c
@@ -117,7 +117,8 @@
#define PCAN_USB_BERR_MASK (PCAN_USB_ERR_RXERR | PCAN_USB_ERR_TXERR)
/* identify bus event packets with rx/tx error counters */
-#define PCAN_USB_ERR_CNT 0x80
+#define PCAN_USB_ERR_CNT_DEC 0x00 /* counters are decreasing */
+#define PCAN_USB_ERR_CNT_INC 0x80 /* counters are increasing */
/* private to PCAN-USB adapter */
struct pcan_usb {
@@ -608,11 +609,12 @@ static int pcan_usb_handle_bus_evt(struct pcan_usb_msg_context *mc, u8 ir)
/* acccording to the content of the packet */
switch (ir) {
- case PCAN_USB_ERR_CNT:
+ case PCAN_USB_ERR_CNT_DEC:
+ case PCAN_USB_ERR_CNT_INC:
/* save rx/tx error counters from in the device context */
- pdev->bec.rxerr = mc->ptr[0];
- pdev->bec.txerr = mc->ptr[1];
+ pdev->bec.rxerr = mc->ptr[1];
+ pdev->bec.txerr = mc->ptr[2];
break;
default:
--
2.30.2
From: Christoph Hellwig <hch(a)lst.de>
Subject: mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
memcpy_to_page and memzero_page can write to arbitrary pages, which could
be in the page cache or in high memory, so call flush_kernel_dcache_pages
to flush the dcache.
This is a problem when using these helpers on dcache challeneged
architectures. Right now there are just a few users, chances are no
one used the PC floppy dr\u0456ver, the aha1542 driver for an ISA SCSI
HBA, and a few advanced and optional btrfs and ext4 features on those
platforms yet since the conversion.
Link: https://lkml.kernel.org/r/20210713055231.137602-2-hch@lst.de
Fixes: bb90d4bc7b6a ("mm/highmem: Lift memcpy_[to|from]_page to core")
Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h")
Signed-off-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Ira Weiny <ira.weiny(a)intel.com>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/highmem.h | 2 ++
1 file changed, 2 insertions(+)
--- a/include/linux/highmem.h~mm-call-flush_dcache_page-in-memcpy_to_page-and-memzero_page
+++ a/include/linux/highmem.h
@@ -318,6 +318,7 @@ static inline void memcpy_to_page(struct
VM_BUG_ON(offset + len > PAGE_SIZE);
memcpy(to + offset, from, len);
+ flush_dcache_page(page);
kunmap_local(to);
}
@@ -325,6 +326,7 @@ static inline void memzero_page(struct p
{
char *addr = kmap_atomic(page);
memset(addr + offset, 0, len);
+ flush_dcache_page(page);
kunmap_atomic(addr);
}
_
From: Alexander Potapenko <glider(a)google.com>
Subject: kfence: skip all GFP_ZONEMASK allocations
Allocation requests outside ZONE_NORMAL (MOVABLE, HIGHMEM or DMA) cannot
be fulfilled by KFENCE, because KFENCE memory pool is located in a zone
different from the requested one.
Because callers of kmem_cache_alloc() may actually rely on the allocation
to reside in the requested zone (e.g. memory allocations done with
__GFP_DMA must be DMAable), skip all allocations done with GFP_ZONEMASK
and/or respective SLAB flags (SLAB_CACHE_DMA and SLAB_CACHE_DMA32).
Link: https://lkml.kernel.org/r/20210714092222.1890268-2-glider@google.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reviewed-by: Marco Elver <elver(a)google.com>
Acked-by: Souptick Joarder <jrdr.linux(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Souptick Joarder <jrdr.linux(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kfence/core.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/mm/kfence/core.c~kfence-skip-all-gfp_zonemask-allocations
+++ a/mm/kfence/core.c
@@ -741,6 +741,15 @@ void *__kfence_alloc(struct kmem_cache *
return NULL;
/*
+ * Skip allocations from non-default zones, including DMA. We cannot
+ * guarantee that pages in the KFENCE pool will have the requested
+ * properties (e.g. reside in DMAable memory).
+ */
+ if ((flags & GFP_ZONEMASK) ||
+ (s->flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32)))
+ return NULL;
+
+ /*
* allocation_gate only needs to become non-zero, so it doesn't make
* sense to continue writing to it and pay the associated contention
* cost, in case we have a large number of concurrent allocations.
_
From: Alexander Potapenko <glider(a)google.com>
Subject: kfence: move the size check to the beginning of __kfence_alloc()
Check the allocation size before toggling kfence_allocation_gate. This
way allocations that can't be served by KFENCE will not result in waiting
for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating anything.
Link: https://lkml.kernel.org/r/20210714092222.1890268-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Suggested-by: Marco Elver <elver(a)google.com>
Reviewed-by: Marco Elver <elver(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Marco Elver <elver(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kfence/core.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/mm/kfence/core.c~kfence-move-the-size-check-to-the-beginning-of-__kfence_alloc
+++ a/mm/kfence/core.c
@@ -734,6 +734,13 @@ void kfence_shutdown_cache(struct kmem_c
void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
{
/*
+ * Perform size check before switching kfence_allocation_gate, so that
+ * we don't disable KFENCE without making an allocation.
+ */
+ if (size > PAGE_SIZE)
+ return NULL;
+
+ /*
* allocation_gate only needs to become non-zero, so it doesn't make
* sense to continue writing to it and pay the associated contention
* cost, in case we have a large number of concurrent allocations.
@@ -757,9 +764,6 @@ void *__kfence_alloc(struct kmem_cache *
if (!READ_ONCE(kfence_enabled))
return NULL;
- if (size > PAGE_SIZE)
- return NULL;
-
return kfence_guarded_alloc(s, size, flags);
}
_
From: Peter Collingbourne <pcc(a)google.com>
Subject: selftest: use mmap instead of posix_memalign to allocate memory
This test passes pointers obtained from anon_allocate_area to the
userfaultfd and mremap APIs. This causes a problem if the system
allocator returns tagged pointers because with the tagged address ABI the
kernel rejects tagged addresses passed to these APIs, which would end up
causing the test to fail. To make this test compatible with such system
allocators, stop using the system allocator to allocate memory in
anon_allocate_area, and instead just use mmap.
Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com
Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5…
Fixes: c47174fc362a ("userfaultfd: selftest")
Co-developed-by: Lokesh Gidra <lokeshgidra(a)google.com>
Signed-off-by: Lokesh Gidra <lokeshgidra(a)google.com>
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: Dave Martin <Dave.Martin(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Alistair Delva <adelva(a)google.com>
Cc: William McVicker <willmcvicker(a)google.com>
Cc: Evgenii Stepanov <eugenis(a)google.com>
Cc: Mitch Phillips <mitchp(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [5.4]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/vm/userfaultfd.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/vm/userfaultfd.c~selftest-use-mmap-instead-of-posix_memalign-to-allocate-memory
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -210,8 +210,10 @@ static void anon_release_pages(char *rel
static void anon_allocate_area(void **alloc_area)
{
- if (posix_memalign(alloc_area, page_size, nr_pages * page_size))
- err("posix_memalign() failed");
+ *alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+ if (*alloc_area == MAP_FAILED)
+ err("mmap of anonymous memory failed");
}
static void noop_alias_mapping(__u64 *start, size_t len, unsigned long offset)
_
From: Peter Collingbourne <pcc(a)google.com>
Subject: userfaultfd: do not untag user pointers
Patch series "userfaultfd: do not untag user pointers", v5.
If a user program uses userfaultfd on ranges of heap memory, it may end up
passing a tagged pointer to the kernel in the range.start field of the
UFFDIO_REGISTER ioctl. This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].
When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg. However, from the application's perspective, the tagged address
*is* the memory address, so if the application is unaware of memory tags,
it may get confused by receiving an address that is, from its point of
view, outside of the bounds of the allocation. We observed this behavior
in the kselftest for userfaultfd [2] but other applications could have the
same problem.
Address this by not untagging pointers passed to the userfaultfd ioctls.
Instead, let the system call fail. Also change the kselftest to use mmap
so that it doesn't encounter this problem.
[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c
This patch (of 2):
If a user program uses userfaultfd on ranges of heap memory, it may end up
passing a tagged pointer to the kernel in the range.start field of the
UFFDIO_REGISTER ioctl. This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].
When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg. However, from the application's perspective, the tagged address
*is* the memory address, so if the application is unaware of memory tags,
it may get confused by receiving an address that is, from its point of
view, outside of the bounds of the allocation. We observed this behavior
in the kselftest for userfaultfd [2] but other applications could have the
same problem.
Address this by not untagging pointers passed to the userfaultfd ioctls.
Instead, let the system call fail. This will provide an early indication
of problems with tag-unaware userspace code instead of letting the code
get confused later, and is consistent with how we decided to handle
brk/mmap/mremap in commit dcde237319e6 ("mm: Avoid creating virtual
address aliases in brk()/mmap()/mremap()"), as well as being consistent
with the existing tagged address ABI documentation relating to how ioctl
arguments are handled.
The code change is a revert of commit 7d0325749a6c ("userfaultfd: untag
user pointers") plus some fixups to some additional calls to
validate_range that have appeared since then.
[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c
Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com
Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0…
Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Signed-off-by: Peter Collingbourne <pcc(a)google.com>
Reviewed-by: Andrey Konovalov <andreyknvl(a)gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Alistair Delva <adelva(a)google.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Dave Martin <Dave.Martin(a)arm.com>
Cc: Evgenii Stepanov <eugenis(a)google.com>
Cc: Lokesh Gidra <lokeshgidra(a)google.com>
Cc: Mitch Phillips <mitchp(a)google.com>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: William McVicker <willmcvicker(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.4]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
Documentation/arm64/tagged-address-abi.rst | 26 +++++++++++++------
fs/userfaultfd.c | 26 ++++++++-----------
2 files changed, 30 insertions(+), 22 deletions(-)
--- a/Documentation/arm64/tagged-address-abi.rst~userfaultfd-do-not-untag-user-pointers
+++ a/Documentation/arm64/tagged-address-abi.rst
@@ -45,14 +45,24 @@ how the user addresses are used by the k
1. User addresses not accessed by the kernel but used for address space
management (e.g. ``mprotect()``, ``madvise()``). The use of valid
- tagged pointers in this context is allowed with the exception of
- ``brk()``, ``mmap()`` and the ``new_address`` argument to
- ``mremap()`` as these have the potential to alias with existing
- user addresses.
-
- NOTE: This behaviour changed in v5.6 and so some earlier kernels may
- incorrectly accept valid tagged pointers for the ``brk()``,
- ``mmap()`` and ``mremap()`` system calls.
+ tagged pointers in this context is allowed with these exceptions:
+
+ - ``brk()``, ``mmap()`` and the ``new_address`` argument to
+ ``mremap()`` as these have the potential to alias with existing
+ user addresses.
+
+ NOTE: This behaviour changed in v5.6 and so some earlier kernels may
+ incorrectly accept valid tagged pointers for the ``brk()``,
+ ``mmap()`` and ``mremap()`` system calls.
+
+ - The ``range.start``, ``start`` and ``dst`` arguments to the
+ ``UFFDIO_*`` ``ioctl()``s used on a file descriptor obtained from
+ ``userfaultfd()``, as fault addresses subsequently obtained by reading
+ the file descriptor will be untagged, which may otherwise confuse
+ tag-unaware programs.
+
+ NOTE: This behaviour changed in v5.14 and so some earlier kernels may
+ incorrectly accept valid tagged pointers for this system call.
2. User addresses accessed by the kernel (e.g. ``write()``). This ABI
relaxation is disabled by default and the application thread needs to
--- a/fs/userfaultfd.c~userfaultfd-do-not-untag-user-pointers
+++ a/fs/userfaultfd.c
@@ -1236,23 +1236,21 @@ static __always_inline void wake_userfau
}
static __always_inline int validate_range(struct mm_struct *mm,
- __u64 *start, __u64 len)
+ __u64 start, __u64 len)
{
__u64 task_size = mm->task_size;
- *start = untagged_addr(*start);
-
- if (*start & ~PAGE_MASK)
+ if (start & ~PAGE_MASK)
return -EINVAL;
if (len & ~PAGE_MASK)
return -EINVAL;
if (!len)
return -EINVAL;
- if (*start < mmap_min_addr)
+ if (start < mmap_min_addr)
return -EINVAL;
- if (*start >= task_size)
+ if (start >= task_size)
return -EINVAL;
- if (len > task_size - *start)
+ if (len > task_size - start)
return -EINVAL;
return 0;
}
@@ -1316,7 +1314,7 @@ static int userfaultfd_register(struct u
vm_flags |= VM_UFFD_MINOR;
}
- ret = validate_range(mm, &uffdio_register.range.start,
+ ret = validate_range(mm, uffdio_register.range.start,
uffdio_register.range.len);
if (ret)
goto out;
@@ -1522,7 +1520,7 @@ static int userfaultfd_unregister(struct
if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
goto out;
- ret = validate_range(mm, &uffdio_unregister.start,
+ ret = validate_range(mm, uffdio_unregister.start,
uffdio_unregister.len);
if (ret)
goto out;
@@ -1671,7 +1669,7 @@ static int userfaultfd_wake(struct userf
if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake)))
goto out;
- ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len);
+ ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
if (ret)
goto out;
@@ -1711,7 +1709,7 @@ static int userfaultfd_copy(struct userf
sizeof(uffdio_copy)-sizeof(__s64)))
goto out;
- ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len);
+ ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
if (ret)
goto out;
/*
@@ -1768,7 +1766,7 @@ static int userfaultfd_zeropage(struct u
sizeof(uffdio_zeropage)-sizeof(__s64)))
goto out;
- ret = validate_range(ctx->mm, &uffdio_zeropage.range.start,
+ ret = validate_range(ctx->mm, uffdio_zeropage.range.start,
uffdio_zeropage.range.len);
if (ret)
goto out;
@@ -1818,7 +1816,7 @@ static int userfaultfd_writeprotect(stru
sizeof(struct uffdio_writeprotect)))
return -EFAULT;
- ret = validate_range(ctx->mm, &uffdio_wp.range.start,
+ ret = validate_range(ctx->mm, uffdio_wp.range.start,
uffdio_wp.range.len);
if (ret)
return ret;
@@ -1866,7 +1864,7 @@ static int userfaultfd_continue(struct u
sizeof(uffdio_continue) - (sizeof(__s64))))
goto out;
- ret = validate_range(ctx->mm, &uffdio_continue.range.start,
+ ret = validate_range(ctx->mm, uffdio_continue.range.start,
uffdio_continue.range.len);
if (ret)
goto out;
_
Hello there,
I am Jotham Masuku, a broker working with FlippieBecker Wealth
Services in SA. I am contacting you because one of my high
profile clients is interested in investing abroad and has asked
me to look for individuals and companies with interesting
business ideas and companies that he can invest in.
He wants to expand his portfolio and has interest in investing a
substantial amount of asset abroad. I got your email contact
through an online b2b business directory and I thought I'd
contact you to see if you are interested in this opportunity.
Please indicate your interest by replying back to this email.
Once I get your response, I will give you more details and we can
plan a strategy that will be beneficial to all parties.
Best Regards,
J Masuku
FippieBecker Wealth Services
Commit 7930742d6, reverting 26fd962, missed out on reverting an incorrect
change to a return value. The niu_pci_vpd_scan_props(..) == 1 case appears
to be a normal path - treating it as an error and return -EINVAL was
breaking VPD_SCAN and causing the driver to fail to load.
Fix, so my Neptune card works again.
Cc: Kangjie Lu <kjlu(a)umn.edu>
Cc: Shannon Nelson <shannon.lee.nelson(a)gmail.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: stable <stable(a)vger.kernel.org>
Fixes: 7930742d ('Revert "niu: fix missing checks of niu_pci_eeprom_read"')
Signed-off-by: Paul Jakma <paul(a)jakma.org>
---
drivers/net/ethernet/sun/niu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index 74e748662ec0..860644d182ab 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -8191,8 +8191,9 @@ static int niu_pci_vpd_fetch(struct niu *np, u32 start)
err = niu_pci_vpd_scan_props(np, here, end);
if (err < 0)
return err;
+ /* ret == 1 is not an error */
if (err == 1)
- return -EINVAL;
+ return 0;
}
return 0;
}
--
2.31.1
A wwan link created via the wwan_create_default_link procedure is
never notified to the user (RTM_NEWLINK), causing issues with user
tools relying on such event to track network links (NetworkManager).
This is because the procedure misses a call to rtnl_configure_link(),
which sets the link as initialized and notifies the new link (cf
proper usage in __rtnl_newlink()).
Cc: stable(a)vger.kernel.org
Fixes: ca374290aaad ("wwan: core: support default netdev creation")
Suggested-by: Sergey Ryazanov <ryazanov.s.a(a)gmail.com>
Signed-off-by: Loic Poulain <loic.poulain(a)linaro.org>
---
drivers/net/wwan/wwan_core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c
index 3e16c31..674a81d 100644
--- a/drivers/net/wwan/wwan_core.c
+++ b/drivers/net/wwan/wwan_core.c
@@ -984,6 +984,8 @@ static void wwan_create_default_link(struct wwan_device *wwandev,
goto unlock;
}
+ rtnl_configure_link(dev, NULL); /* Link initialized, notify new link */
+
unlock:
rtnl_unlock();
--
2.7.4
Commit 7930742d6, reverting 26fd962, missed out on reverting an incorrect
change to a return value. The niu_pci_vpd_scan_props(..) == 1 case appears
to be a normal path - treating it as an error and return -EINVAL was
breaking VPD_SCAN and causing the driver to fail to load.
Fix, so my Neptune card works again.
Cc: Kangjie Lu <kjlu(a)umn.edu>
Cc: Shannon Nelson <shannon.lee.nelson(a)gmail.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: stable <stable(a)vger.kernel.org>
Fixes: 7930742d ('Revert "niu: fix missing checks of niu_pci_eeprom_read"')
Signed-off-by: Paul Jakma <paul(a)jakma.org>
---
drivers/net/ethernet/sun/niu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index 74e748662ec0..860644d182ab 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -8191,8 +8191,9 @@ static int niu_pci_vpd_fetch(struct niu *np, u32 start)
err = niu_pci_vpd_scan_props(np, here, end);
if (err < 0)
return err;
+ /* ret == 1 is not an error */
if (err == 1)
- return -EINVAL;
+ return 0;
}
return 0;
}
--
2.31.1
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Because of the significant overhead that retpolines pose on indirect
calls, the tracepoint code was updated to use the new "static_calls" that
can modify the running code to directly call a function instead of using
an indirect caller, and this function can be changed at runtime.
In the tracepoint code that calls all the registered callbacks that are
attached to a tracepoint, the following is done:
it_func_ptr = rcu_dereference_raw((&__tracepoint_##name)->funcs);
if (it_func_ptr) {
__data = (it_func_ptr)->data;
static_call(tp_func_##name)(__data, args);
}
If there's just a single callback, the static_call is updated to just call
that callback directly. Once another handler is added, then the static
caller is updated to call the iterator, that simply loops over all the
funcs in the array and calls each of the callbacks like the old method
using indirect calling.
The issue was discovered with a race between updating the funcs array and
updating the static_call. The funcs array was updated first and then the
static_call was updated. This is not an issue as long as the first element
in the old array is the same as the first element in the new array. But
that assumption is incorrect, because callbacks also have a priority
field, and if there's a callback added that has a higher priority than the
callback on the old array, then it will become the first callback in the
new array. This means that it is possible to call the old callback with
the new callback data element, which can cause a kernel panic.
static_call = callback1()
funcs[] = {callback1,data1};
callback2 has higher priority than callback1
CPU 1 CPU 2
----- -----
new_funcs = {callback2,data2},
{callback1,data1}
rcu_assign_pointer(tp->funcs, new_funcs);
/*
* Now tp->funcs has the new array
* but the static_call still calls callback1
*/
it_func_ptr = tp->funcs [ new_funcs ]
data = it_func_ptr->data [ data2 ]
static_call(callback1, data);
/* Now callback1 is called with
* callback2's data */
[ KERNEL PANIC ]
update_static_call(iterator);
To prevent this from happening, always switch the static_call to the
iterator before assigning the tp->funcs to the new array. The iterator will
always properly match the callback with its data.
To trigger this bug:
In one terminal:
while :; do hackbench 50; done
In another terminal
echo 1 > /sys/kernel/tracing/events/sched/sched_waking/enable
while :; do
echo 1 > /sys/kernel/tracing/set_event_pid;
sleep 0.5
echo 0 > /sys/kernel/tracing/set_event_pid;
sleep 0.5
done
And it doesn't take long to crash. This is because the set_event_pid adds
a callback to the sched_waking tracepoint with a high priority, which will
be called before the sched_waking trace event callback is called.
Note, the removal to a single callback updates the array first, before
changing the static_call to single callback, which is the proper order as
the first element in the array is the same as what the static_call is
being changed to.
Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba…
Cc: stable(a)vger.kernel.org
Fixes: d25e37d89dd2f ("tracepoint: Optimize using static_call()")
Reported-by: Stefan Metzmacher <metze(a)samba.org>
tested-by: Stefan Metzmacher <metze(a)samba.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/tracepoint.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 976bf8ce8039..fc32821f8240 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -299,8 +299,8 @@ static int tracepoint_add_func(struct tracepoint *tp,
* a pointer to it. This array is referenced by __DO_TRACE from
* include/linux/tracepoint.h using rcu_dereference_sched().
*/
- rcu_assign_pointer(tp->funcs, tp_funcs);
tracepoint_update_call(tp, tp_funcs, false);
+ rcu_assign_pointer(tp->funcs, tp_funcs);
static_key_enable(&tp->key);
release_probes(old);
--
2.30.2
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Currently the histogram logic allows the user to write "cpu" in as an
event field, and it will record the CPU that the event happened on.
The problem with this is that there's a lot of events that have "cpu"
as a real field, and using "cpu" as the CPU it ran on, makes it
impossible to run histograms on the "cpu" field of events.
For example, if I want to have a histogram on the count of the
workqueue_queue_work event on its cpu field, running:
># echo 'hist:keys=cpu' > events/workqueue/workqueue_queue_work/trigger
Gives a misleading and wrong result.
Change the command to "common_cpu" as no event should have "common_*"
fields as that's a reserved name for fields used by all events. And
this makes sense here as common_cpu would be a field used by all events.
Now we can even do:
># echo 'hist:keys=common_cpu,cpu if cpu < 100' > events/workqueue/workqueue_queue_work/trigger
># cat events/workqueue/workqueue_queue_work/hist
# event histogram
#
# trigger info: hist:keys=common_cpu,cpu:vals=hitcount:sort=hitcount:size=2048 if cpu < 100 [active]
#
{ common_cpu: 0, cpu: 2 } hitcount: 1
{ common_cpu: 0, cpu: 4 } hitcount: 1
{ common_cpu: 7, cpu: 7 } hitcount: 1
{ common_cpu: 0, cpu: 7 } hitcount: 1
{ common_cpu: 0, cpu: 1 } hitcount: 1
{ common_cpu: 0, cpu: 6 } hitcount: 2
{ common_cpu: 0, cpu: 5 } hitcount: 2
{ common_cpu: 1, cpu: 1 } hitcount: 4
{ common_cpu: 6, cpu: 6 } hitcount: 4
{ common_cpu: 5, cpu: 5 } hitcount: 14
{ common_cpu: 4, cpu: 4 } hitcount: 26
{ common_cpu: 0, cpu: 0 } hitcount: 39
{ common_cpu: 2, cpu: 2 } hitcount: 184
Now for backward compatibility, I added a trick. If "cpu" is used, and
the field is not found, it will fall back to "common_cpu" and work as
it did before. This way, it will still work for old programs that use
"cpu" to get the actual CPU, but if the event has a "cpu" as a field, it
will get that event's "cpu" field, which is probably what it wants
anyway.
I updated the tracefs/README to include documentation about both the
common_timestamp and the common_cpu. This way, if that text is present in
the README, then an application can know that common_cpu is supported over
just plain "cpu".
Link: https://lkml.kernel.org/r/20210721110053.26b4f641@oasis.local.home
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org
Fixes: 8b7622bf94a44 ("tracing: Add cpu field for hist triggers")
Reviewed-by: Tom Zanussi <zanussi(a)kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
Documentation/trace/histogram.rst | 2 +-
kernel/trace/trace.c | 4 ++++
kernel/trace/trace_events_hist.c | 22 ++++++++++++++++------
3 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst
index b71e09f745c3..f99be8062bc8 100644
--- a/Documentation/trace/histogram.rst
+++ b/Documentation/trace/histogram.rst
@@ -191,7 +191,7 @@ Documentation written by Tom Zanussi
with the event, in nanoseconds. May be
modified by .usecs to have timestamps
interpreted as microseconds.
- cpu int the cpu on which the event occurred.
+ common_cpu int the cpu on which the event occurred.
====================== ==== =======================================
Extended error information
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index f8b80b5bab71..c59dd35a6da5 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5609,6 +5609,10 @@ static const char readme_msg[] =
"\t [:name=histname1]\n"
"\t [:<handler>.<action>]\n"
"\t [if <filter>]\n\n"
+ "\t Note, special fields can be used as well:\n"
+ "\t common_timestamp - to record current timestamp\n"
+ "\t common_cpu - to record the CPU the event happened on\n"
+ "\n"
"\t When a matching event is hit, an entry is added to a hash\n"
"\t table using the key(s) and value(s) named, and the value of a\n"
"\t sum called 'hitcount' is incremented. Keys and values\n"
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 16a9dfc9fffc..34325f41ebc0 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1111,7 +1111,7 @@ static const char *hist_field_name(struct hist_field *field,
field->flags & HIST_FIELD_FL_ALIAS)
field_name = hist_field_name(field->operands[0], ++level);
else if (field->flags & HIST_FIELD_FL_CPU)
- field_name = "cpu";
+ field_name = "common_cpu";
else if (field->flags & HIST_FIELD_FL_EXPR ||
field->flags & HIST_FIELD_FL_VAR_REF) {
if (field->system) {
@@ -1991,14 +1991,24 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
hist_data->enable_timestamps = true;
if (*flags & HIST_FIELD_FL_TIMESTAMP_USECS)
hist_data->attrs->ts_in_usecs = true;
- } else if (strcmp(field_name, "cpu") == 0)
+ } else if (strcmp(field_name, "common_cpu") == 0)
*flags |= HIST_FIELD_FL_CPU;
else {
field = trace_find_event_field(file->event_call, field_name);
if (!field || !field->size) {
- hist_err(tr, HIST_ERR_FIELD_NOT_FOUND, errpos(field_name));
- field = ERR_PTR(-EINVAL);
- goto out;
+ /*
+ * For backward compatibility, if field_name
+ * was "cpu", then we treat this the same as
+ * common_cpu.
+ */
+ if (strcmp(field_name, "cpu") == 0) {
+ *flags |= HIST_FIELD_FL_CPU;
+ } else {
+ hist_err(tr, HIST_ERR_FIELD_NOT_FOUND,
+ errpos(field_name));
+ field = ERR_PTR(-EINVAL);
+ goto out;
+ }
}
}
out:
@@ -5085,7 +5095,7 @@ static void hist_field_print(struct seq_file *m, struct hist_field *hist_field)
seq_printf(m, "%s=", hist_field->var.name);
if (hist_field->flags & HIST_FIELD_FL_CPU)
- seq_puts(m, "cpu");
+ seq_puts(m, "common_cpu");
else if (field_name) {
if (hist_field->flags & HIST_FIELD_FL_VAR_REF ||
hist_field->flags & HIST_FIELD_FL_ALIAS)
--
2.30.2
From: Haoran Luo <www(a)aegistudio.net>
The "rb_per_cpu_empty()" misinterpret the condition (as not-empty) when
"head_page" and "commit_page" of "struct ring_buffer_per_cpu" points to
the same buffer page, whose "buffer_data_page" is empty and "read" field
is non-zero.
An error scenario could be constructed as followed (kernel perspective):
1. All pages in the buffer has been accessed by reader(s) so that all of
them will have non-zero "read" field.
2. Read and clear all buffer pages so that "rb_num_of_entries()" will
return 0 rendering there's no more data to read. It is also required
that the "read_page", "commit_page" and "tail_page" points to the same
page, while "head_page" is the next page of them.
3. Invoke "ring_buffer_lock_reserve()" with large enough "length"
so that it shot pass the end of current tail buffer page. Now the
"head_page", "commit_page" and "tail_page" points to the same page.
4. Discard current event with "ring_buffer_discard_commit()", so that
"head_page", "commit_page" and "tail_page" points to a page whose buffer
data page is now empty.
When the error scenario has been constructed, "tracing_read_pipe" will
be trapped inside a deadloop: "trace_empty()" returns 0 since
"rb_per_cpu_empty()" returns 0 when it hits the CPU containing such
constructed ring buffer. Then "trace_find_next_entry_inc()" always
return NULL since "rb_num_of_entries()" reports there's no more entry
to read. Finally "trace_seq_to_user()" returns "-EBUSY" spanking
"tracing_read_pipe" back to the start of the "waitagain" loop.
I've also written a proof-of-concept script to construct the scenario
and trigger the bug automatically, you can use it to trace and validate
my reasoning above:
https://github.com/aegistudio/RingBufferDetonator.git
Tests has been carried out on linux kernel 5.14-rc2
(2734d6c1b1a089fb593ef6a23d4b70903526fe0c), my fixed version
of kernel (for testing whether my update fixes the bug) and
some older kernels (for range of affected kernels). Test result is
also attached to the proof-of-concept repository.
Link: https://lore.kernel.org/linux-trace-devel/YPaNxsIlb2yjSi5Y@aegistudio/
Link: https://lore.kernel.org/linux-trace-devel/YPgrN85WL9VyrZ55@aegistudio
Cc: stable(a)vger.kernel.org
Fixes: bf41a158cacba ("ring-buffer: make reentrant")
Suggested-by: Linus Torvalds <torvalds(a)linuxfoundation.org>
Signed-off-by: Haoran Luo <www(a)aegistudio.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/ring_buffer.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index d1463eac11a3..e592d1df6f88 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3880,10 +3880,30 @@ static bool rb_per_cpu_empty(struct ring_buffer_per_cpu *cpu_buffer)
if (unlikely(!head))
return true;
- return reader->read == rb_page_commit(reader) &&
- (commit == reader ||
- (commit == head &&
- head->read == rb_page_commit(commit)));
+ /* Reader should exhaust content in reader page */
+ if (reader->read != rb_page_commit(reader))
+ return false;
+
+ /*
+ * If writers are committing on the reader page, knowing all
+ * committed content has been read, the ring buffer is empty.
+ */
+ if (commit == reader)
+ return true;
+
+ /*
+ * If writers are committing on a page other than reader page
+ * and head page, there should always be content to read.
+ */
+ if (commit != head)
+ return false;
+
+ /*
+ * Writers are committing on the head page, we just need
+ * to care about there're committed data, and the reader will
+ * swap reader page with head page when it is to read data.
+ */
+ return rb_page_commit(commit) == 0;
}
/**
--
2.30.2
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9992a078b1771da354ac1f9737e1e639b687caa2 Mon Sep 17 00:00:00 2001
From: Hangbin Liu <liuhangbin(a)gmail.com>
Date: Fri, 9 Jul 2021 11:45:02 +0800
Subject: [PATCH] net: ip_tunnel: fix mtu calculation for ETHER tunnel devices
Commit 28e104d00281 ("net: ip_tunnel: fix mtu calculation") removed
dev->hard_header_len subtraction when calculate MTU for tunnel devices
as there is an overhead for device that has header_ops.
But there are ETHER tunnel devices, like gre_tap or erspan, which don't
have header_ops but set dev->hard_header_len during setup. This makes
pkts greater than (MTU - ETH_HLEN) could not be xmited. Fix it by
subtracting the ETHER tunnel devices' dev->hard_header_len for MTU
calculation.
Fixes: 28e104d00281 ("net: ip_tunnel: fix mtu calculation")
Reported-by: Jianlin Shi <jishi(a)redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index f6cc26de5ed3..0dca00745ac3 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -317,7 +317,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev)
}
dev->needed_headroom = t_hlen + hlen;
- mtu -= t_hlen;
+ mtu -= t_hlen + (dev->type == ARPHRD_ETHER ? dev->hard_header_len : 0);
if (mtu < IPV4_MIN_MTU)
mtu = IPV4_MIN_MTU;
@@ -348,6 +348,9 @@ static struct ip_tunnel *ip_tunnel_create(struct net *net,
t_hlen = nt->hlen + sizeof(struct iphdr);
dev->min_mtu = ETH_MIN_MTU;
dev->max_mtu = IP_MAX_MTU - t_hlen;
+ if (dev->type == ARPHRD_ETHER)
+ dev->max_mtu -= dev->hard_header_len;
+
ip_tunnel_add(itn, nt);
return nt;
@@ -489,11 +492,14 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
tunnel_hlen = md ? tunnel_hlen : tunnel->hlen;
pkt_size = skb->len - tunnel_hlen;
+ pkt_size -= dev->type == ARPHRD_ETHER ? dev->hard_header_len : 0;
- if (df)
+ if (df) {
mtu = dst_mtu(&rt->dst) - (sizeof(struct iphdr) + tunnel_hlen);
- else
+ mtu -= dev->type == ARPHRD_ETHER ? dev->hard_header_len : 0;
+ } else {
mtu = skb_valid_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
+ }
if (skb_valid_dst(skb))
skb_dst_update_pmtu_no_confirm(skb, mtu);
@@ -972,6 +978,9 @@ int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict)
int t_hlen = tunnel->hlen + sizeof(struct iphdr);
int max_mtu = IP_MAX_MTU - t_hlen;
+ if (dev->type == ARPHRD_ETHER)
+ max_mtu -= dev->hard_header_len;
+
if (new_mtu < ETH_MIN_MTU)
return -EINVAL;
@@ -1149,6 +1158,9 @@ int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
if (tb[IFLA_MTU]) {
unsigned int max = IP_MAX_MTU - (nt->hlen + sizeof(struct iphdr));
+ if (dev->type == ARPHRD_ETHER)
+ max -= dev->hard_header_len;
+
mtu = clamp(dev->mtu, (unsigned int)ETH_MIN_MTU, max);
}
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 873b8bf410d6 - mt76: mt7921: continue to probe driver when fw already downloaded
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefi…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Reboot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ ACPI table test
✅ ACPI enabled test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ i2c: i2cdetect sanity
🚧 ✅ Firmware test suite
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
ppc64le:
Host 1:
✅ Boot test
✅ Reboot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ Storage: swraid mdadm raid_module test
🚧 ❌ Podman system integration test - as root
🚧 ❌ Podman system integration test - as user
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
🚧 ✅ Storage block - filesystem fio test
🚧 ✅ Storage block - queue scheduler test
🚧 ✅ Storage nvme - tcp
🚧 ✅ Storage: lvm device-mapper test - upstream
Host 2:
✅ Boot test
✅ Reboot test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
s390x:
Host 1:
✅ Boot test
✅ Reboot test
✅ selinux-policy: serge-testsuite
✅ Storage: swraid mdadm raid_module test
🚧 ❌ Podman system integration test - as root
🚧 ❌ Podman system integration test - as user
🚧 ✅ Storage blktests
🚧 ✅ Storage nvme - tcp
🚧 ✅ stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ trace: ftrace/tracer
🚧 ✅ xarray-idr-radixtree-test
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
x86_64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ Reboot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ xfstests - nfsv4.2
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ power-management: cpupower/sanity test
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Host 2:
✅ Boot test
✅ Reboot test
✅ ACPI table test
✅ LTP
✅ CIFS Connectathon
✅ POSIX pjd-fstest suites
✅ Loopdev Sanity
✅ jvm - jcstress tests
✅ Memory: fork_mem
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking cki netfilter test
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
✅ trace: ftrace/tracer
🚧 ❌ xarray-idr-radixtree-test
🚧 ✅ i2c: i2cdetect sanity
🚧 ✅ Firmware test suite
🚧 ✅ Memory function: kaslr
🚧 ✅ audit: audit testsuite test
🚧 ✅ lvm cache test
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Reboot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ xfstests - nfsv4.2
✅ selinux-policy: serge-testsuite
✅ power-management: cpupower/sanity test
✅ storage: software RAID testing
✅ Storage: swraid mdadm raid_module test
🚧 ⚡⚡⚡ Podman system integration test - as root
🚧 ⚡⚡⚡ Podman system integration test - as user
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ xfstests - cifsv3.11
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
🚧 ⚡⚡⚡ Storage block - filesystem fio test
🚧 ⚡⚡⚡ Storage block - queue scheduler test
🚧 ⚡⚡⚡ Storage nvme - tcp
🚧 ⚡⚡⚡ Storage: lvm device-mapper test - upstream
🚧 ⚡⚡⚡ stress: stress-ng
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
[ Upstream commit 5a3c680aa2c12c90c44af383fe6882a39875ab81 ]
Setting the EXT_ENERGY_DET_MASK bit allows the port energy detection
logic of the internal PHY to prevent the system from sleeping. Some
internal PHYs will report that energy is detected when the network
interface is closed which can prevent the system from going to sleep
if WoL is enabled when the interface is brought down.
Since the driver does not support waking the system on this logic,
this commit clears the bit whenever the internal PHY is powered up
and the other logic for manipulating the bit is removed since it
serves no useful function.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
---
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 15 +--------------
.../net/ethernet/broadcom/genet/bcmgenet_wol.c | 6 ------
2 files changed, 1 insertion(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 2921ae13db28..5637adff1888 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1094,7 +1094,7 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
switch (mode) {
case GENET_POWER_PASSIVE:
reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_PHY |
- EXT_PWR_DOWN_BIAS);
+ EXT_PWR_DOWN_BIAS | EXT_ENERGY_DET_MASK);
/* fallthrough */
case GENET_POWER_CABLE_SENSE:
/* enable APD */
@@ -2815,12 +2815,6 @@ static int bcmgenet_open(struct net_device *dev)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
@@ -3510,7 +3504,6 @@ static int bcmgenet_resume(struct device *d)
struct bcmgenet_priv *priv = netdev_priv(dev);
unsigned long dma_ctrl;
int ret;
- u32 reg;
if (!netif_running(dev))
return 0;
@@ -3545,12 +3538,6 @@ static int bcmgenet_resume(struct device *d)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
if (priv->wolopts)
bcmgenet_power_up(priv, GENET_POWER_WOL_MAGIC);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
index b97122926d3a..df107ed67220 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
@@ -167,12 +167,6 @@ int bcmgenet_wol_power_down_cfg(struct bcmgenet_priv *priv,
reg |= CMD_RX_EN;
bcmgenet_umac_writel(priv, reg, UMAC_CMD);
- if (priv->hw_params->flags & GENET_HAS_EXT) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg &= ~EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Enable the MPD interrupt */
cpu_mask_clear = UMAC_IRQ_MPD_R;
--
2.25.1
[ Upstream commit 5a3c680aa2c12c90c44af383fe6882a39875ab81 ]
Setting the EXT_ENERGY_DET_MASK bit allows the port energy detection
logic of the internal PHY to prevent the system from sleeping. Some
internal PHYs will report that energy is detected when the network
interface is closed which can prevent the system from going to sleep
if WoL is enabled when the interface is brought down.
Since the driver does not support waking the system on this logic,
this commit clears the bit whenever the internal PHY is powered up
and the other logic for manipulating the bit is removed since it
serves no useful function.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
---
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 16 ++--------------
.../net/ethernet/broadcom/genet/bcmgenet_wol.c | 6 ------
2 files changed, 2 insertions(+), 20 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 21669a42718c..c4c5ca02a26c 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1187,7 +1187,8 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
switch (mode) {
case GENET_POWER_PASSIVE:
- reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS);
+ reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS |
+ EXT_ENERGY_DET_MASK);
if (GENET_IS_V5(priv)) {
reg &= ~(EXT_PWR_DOWN_PHY_EN |
EXT_PWR_DOWN_PHY_RD |
@@ -2895,12 +2896,6 @@ static int bcmgenet_open(struct net_device *dev)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
@@ -3617,7 +3612,6 @@ static int bcmgenet_resume(struct device *d)
struct bcmgenet_priv *priv = netdev_priv(dev);
unsigned long dma_ctrl;
int ret;
- u32 reg;
if (!netif_running(dev))
return 0;
@@ -3649,12 +3643,6 @@ static int bcmgenet_resume(struct device *d)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
if (priv->wolopts)
bcmgenet_power_up(priv, GENET_POWER_WOL_MAGIC);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
index a41f82379369..164988f3b4fa 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
@@ -160,12 +160,6 @@ int bcmgenet_wol_power_down_cfg(struct bcmgenet_priv *priv,
reg |= CMD_RX_EN;
bcmgenet_umac_writel(priv, reg, UMAC_CMD);
- if (priv->hw_params->flags & GENET_HAS_EXT) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg &= ~EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
return 0;
}
--
2.25.1
The patch titled
Subject: ocfs2: issue zeroout to EOF blocks
has been added to the -mm tree. Its filename is
ocfs2-issue-zeroout-to-eof-blocks.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/ocfs2-issue-zeroout-to-eof-blocks…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-issue-zeroout-to-eof-blocks…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: issue zeroout to EOF blocks
For punch holes in EOF blocks, fallocate used buffer write to zero the EOF
blocks in last cluster. But since ->writepage will ignore EOF pages,
those zeros will not be flushed.
This "looks" ok as commit 6bba4471f0cc ("ocfs2: fix data corruption by
fallocate") will zero the EOF blocks when extend the file size, but it
isn't. The problem happened on those EOF pages, before writeback, those
pages had DIRTY flag set and all buffer_head in them also had DIRTY flag
set, when writeback run by write_cache_pages(), DIRTY flag on the page was
cleared, but DIRTY flag on the buffer_head not.
When next write happened to those EOF pages, since buffer_head already had
DIRTY flag set, it would not mark page DIRTY again. That made writeback
ignore them forever. That will cause data corruption. Even directio
write can't work because it will fail when trying to drop pages caches
before direct io, as it found the buffer_head for those pages still had
DIRTY flag set, then it will fall back to buffer io mode.
To make a summary of the issue, as writeback ingores EOF pages, once any
EOF page is generated, any write to it will only go to the page cache, it
will never be flushed to disk even file size extends and that page is not
EOF page any more. The fix is to avoid zero EOF blocks with buffer write.
The following code snippet from qemu-img could trigger the corruption.
656 open("6b3711ae-3306-4bdd-823c-cf1c0060a095.conv.2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11
...
660 fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2275868672, 327680 <unfinished ...>
660 fallocate(11, 0, 2275868672, 327680) = 0
658 pwrite64(11, "
Link: https://lkml.kernel.org/r/20210722054923.24389-2-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/file.c | 99 +++++++++++++++++++++++++++-------------------
1 file changed, 60 insertions(+), 39 deletions(-)
--- a/fs/ocfs2/file.c~ocfs2-issue-zeroout-to-eof-blocks
+++ a/fs/ocfs2/file.c
@@ -1529,6 +1529,45 @@ static void ocfs2_truncate_cluster_pages
}
}
+/*
+ * zero out partial blocks of one cluster.
+ *
+ * start: file offset where zero starts, will be made upper block aligned.
+ * len: it will be trimmed to the end of current cluster if "start + len"
+ * is bigger than it.
+ */
+static int ocfs2_zeroout_partial_cluster(struct inode *inode,
+ u64 start, u64 len)
+{
+ int ret;
+ u64 start_block, end_block, nr_blocks;
+ u64 p_block, offset;
+ u32 cluster, p_cluster, nr_clusters;
+ struct super_block *sb = inode->i_sb;
+ u64 end = ocfs2_align_bytes_to_clusters(sb, start);
+
+ if (start + len < end)
+ end = start + len;
+
+ start_block = ocfs2_blocks_for_bytes(sb, start);
+ end_block = ocfs2_blocks_for_bytes(sb, end);
+ nr_blocks = end_block - start_block;
+ if (!nr_blocks)
+ return 0;
+
+ cluster = ocfs2_bytes_to_clusters(sb, start);
+ ret = ocfs2_get_clusters(inode, cluster, &p_cluster,
+ &nr_clusters, NULL);
+ if (ret)
+ return ret;
+ if (!p_cluster)
+ return 0;
+
+ offset = start_block - ocfs2_clusters_to_blocks(sb, cluster);
+ p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset;
+ return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS);
+}
+
static int ocfs2_zero_partial_clusters(struct inode *inode,
u64 start, u64 len)
{
@@ -1538,6 +1577,7 @@ static int ocfs2_zero_partial_clusters(s
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
unsigned int csize = osb->s_clustersize;
handle_t *handle;
+ loff_t isize = i_size_read(inode);
/*
* The "start" and "end" values are NOT necessarily part of
@@ -1558,6 +1598,26 @@ static int ocfs2_zero_partial_clusters(s
if ((start & (csize - 1)) == 0 && (end & (csize - 1)) == 0)
goto out;
+ /* No page cache for EOF blocks, issue zero out to disk. */
+ if (end > isize) {
+ /*
+ * zeroout eof blocks in last cluster starting from
+ * "isize" even "start" > "isize" because it is
+ * complicated to zeroout just at "start" as "start"
+ * may be not aligned with block size, buffer write
+ * would be required to do that, but out of eof buffer
+ * write is not supported.
+ */
+ ret = ocfs2_zeroout_partial_cluster(inode, isize,
+ end - isize);
+ if (ret) {
+ mlog_errno(ret);
+ goto out;
+ }
+ if (start >= isize)
+ goto out;
+ end = isize;
+ }
handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS);
if (IS_ERR(handle)) {
ret = PTR_ERR(handle);
@@ -1856,45 +1916,6 @@ out:
}
/*
- * zero out partial blocks of one cluster.
- *
- * start: file offset where zero starts, will be made upper block aligned.
- * len: it will be trimmed to the end of current cluster if "start + len"
- * is bigger than it.
- */
-static int ocfs2_zeroout_partial_cluster(struct inode *inode,
- u64 start, u64 len)
-{
- int ret;
- u64 start_block, end_block, nr_blocks;
- u64 p_block, offset;
- u32 cluster, p_cluster, nr_clusters;
- struct super_block *sb = inode->i_sb;
- u64 end = ocfs2_align_bytes_to_clusters(sb, start);
-
- if (start + len < end)
- end = start + len;
-
- start_block = ocfs2_blocks_for_bytes(sb, start);
- end_block = ocfs2_blocks_for_bytes(sb, end);
- nr_blocks = end_block - start_block;
- if (!nr_blocks)
- return 0;
-
- cluster = ocfs2_bytes_to_clusters(sb, start);
- ret = ocfs2_get_clusters(inode, cluster, &p_cluster,
- &nr_clusters, NULL);
- if (ret)
- return ret;
- if (!p_cluster)
- return 0;
-
- offset = start_block - ocfs2_clusters_to_blocks(sb, cluster);
- p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset;
- return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS);
-}
-
-/*
* Parts of this function taken from xfs_change_file_space()
*/
static int __ocfs2_change_file_space(struct file *file, struct inode *inode,
_
Patches currently in -mm which might be from junxiao.bi(a)oracle.com are
ocfs2-fix-zero-out-valid-data.patch
ocfs2-issue-zeroout-to-eof-blocks.patch
The patch titled
Subject: ocfs2: fix zero out valid data
has been added to the -mm tree. Its filename is
ocfs2-fix-zero-out-valid-data.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/ocfs2-fix-zero-out-valid-data.pat…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-fix-zero-out-valid-data.pat…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: fix zero out valid data
If append-dio feature is enabled, direct-io write and fallocate could run
in parallel to extend file size, fallocate used "orig_isize" to record
i_size before taking "ip_alloc_sem", when ocfs2_zeroout_partial_cluster()
zeroout EOF blocks, i_size maybe already extended by
ocfs2_dio_end_io_write(), that will cause valid data zeroed out.
Link: https://lkml.kernel.org/r/20210722054923.24389-1-junxiao.bi@oracle.com
Fixes: 6bba4471f0cc ("ocfs2: fix data corruption by fallocate")
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/file.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ocfs2/file.c~ocfs2-fix-zero-out-valid-data
+++ a/fs/ocfs2/file.c
@@ -1935,7 +1935,6 @@ static int __ocfs2_change_file_space(str
goto out_inode_unlock;
}
- orig_isize = i_size_read(inode);
switch (sr->l_whence) {
case 0: /*SEEK_SET*/
break;
@@ -1943,7 +1942,7 @@ static int __ocfs2_change_file_space(str
sr->l_start += f_pos;
break;
case 2: /*SEEK_END*/
- sr->l_start += orig_isize;
+ sr->l_start += i_size_read(inode);
break;
default:
ret = -EINVAL;
@@ -1998,6 +1997,7 @@ static int __ocfs2_change_file_space(str
ret = -EINVAL;
}
+ orig_isize = i_size_read(inode);
/* zeroout eof blocks in the cluster. */
if (!ret && change_size && orig_isize < size) {
ret = ocfs2_zeroout_partial_cluster(inode, orig_isize,
_
Patches currently in -mm which might be from junxiao.bi(a)oracle.com are
ocfs2-fix-zero-out-valid-data.patch
ocfs2-issue-zeroout-to-eof-blocks.patch
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f2165627319ffd33a6217275e5690b1ab5c45763 Mon Sep 17 00:00:00 2001
From: David Sterba <dsterba(a)suse.com>
Date: Mon, 14 Jun 2021 12:45:18 +0200
Subject: [PATCH] btrfs: compression: don't try to compress if we don't have
enough pages
The early check if we should attempt compression does not take into
account the number of input pages. It can happen that there's only one
page, eg. a tail page after some ranges of the BTRFS_MAX_UNCOMPRESSED
have been processed, or an isolated page that won't be converted to an
inline extent.
The single page would be compressed but a later check would drop it
again because the result size must be at least one block shorter than
the input. That can never work with just one page.
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a2494c645681..e6eb20987351 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -629,7 +629,7 @@ static noinline int compress_file_range(struct async_chunk *async_chunk)
* inode has not been flagged as nocompress. This flag can
* change at any time if we discover bad compression ratios.
*/
- if (inode_need_compress(BTRFS_I(inode), start, end)) {
+ if (nr_pages > 1 && inode_need_compress(BTRFS_I(inode), start, end)) {
WARN_ON(pages);
pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
if (!pages) {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 56bc54496f5d6bc638127bfc9df3742cbf0039e7 Mon Sep 17 00:00:00 2001
From: Adam Ford <aford173(a)gmail.com>
Date: Thu, 13 May 2021 06:46:15 -0500
Subject: [PATCH] arm64: dts: renesas: beacon: Fix USB extal reference
The USB extal clock reference isn't associated to a crystal, it's
associated to a programmable clock, so remove the extal reference,
add the usb2_clksel. Since usb_extal is referenced by the versaclock,
reference it here so the usb2_clksel can get the proper clock speed
of 50MHz.
Signed-off-by: Adam Ford <aford173(a)gmail.com>
Link: https://lore.kernel.org/r/20210513114617.30191-1-aford173@gmail.com
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
diff --git a/arch/arm64/boot/dts/renesas/beacon-renesom-som.dtsi b/arch/arm64/boot/dts/renesas/beacon-renesom-som.dtsi
index 75355c354c38..090dc9c4f57b 100644
--- a/arch/arm64/boot/dts/renesas/beacon-renesom-som.dtsi
+++ b/arch/arm64/boot/dts/renesas/beacon-renesom-som.dtsi
@@ -321,8 +321,10 @@ &sdhi3 {
status = "okay";
};
-&usb_extal_clk {
- clock-frequency = <50000000>;
+&usb2_clksel {
+ clocks = <&cpg CPG_MOD 703>, <&cpg CPG_MOD 704>,
+ <&versaclock5 3>, <&usb3s0_clk>;
+ status = "okay";
};
&usb3s0_clk {
By default there is no rtnetlink event generated when registering a
netdev with rtnl_link_ops until its rtnl_link_state is switched to
initialized (RTNL_LINK_INITIALIZED). This causes issues with user
tools like NetworkManager which relies on such event to manage links.
Fix that by setting link to initialized (via rtnl_configure_link).
Cc: stable(a)vger.kernel.org
Fixes: 88b710532e53 ("wwan: add interface creation support")
Signed-off-by: Loic Poulain <loic.poulain(a)linaro.org>
---
drivers/net/wwan/wwan_core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/wwan/wwan_core.c b/drivers/net/wwan/wwan_core.c
index 3e16c31..409caf4 100644
--- a/drivers/net/wwan/wwan_core.c
+++ b/drivers/net/wwan/wwan_core.c
@@ -866,6 +866,10 @@ static int wwan_rtnl_newlink(struct net *src_net, struct net_device *dev,
else
ret = register_netdevice(dev);
+ /* Link initialized, notify new link */
+ if (!ret)
+ rtnl_configure_link(dev, NULL);
+
out:
/* release the reference */
put_device(&wwandev->dev);
--
2.7.4
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 52dfd86aa568e433b24357bb5fc725560f1e22d8 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Thu, 21 Jan 2021 12:54:18 +0100
Subject: [PATCH] vboxsf: Add support for the atomic_open directory-inode op
Opening a new file is done in 2 steps on regular filesystems:
1. Call the create inode-op on the parent-dir to create an inode
to hold the meta-data related to the file.
2. Call the open file-op to get a handle for the file.
vboxsf however does not really use disk-backed inodes because it
is based on passing through file-related system-calls through to
the hypervisor. So both steps translate to an open(2) call being
passed through to the hypervisor. With the handle returned by
the first call immediately being closed again.
Making 2 open calls for a single open(..., O_CREATE, ...) calls
has 2 problems:
a) It is not really efficient.
b) It actually breaks some apps.
An example of b) is doing a git clone inside a vboxsf mount.
When git clone tries to create a tempfile to store the pak
files which is downloading the following happens:
1. vboxsf_dir_mkfile() gets called with a mode of 0444 and succeeds.
2. vboxsf_file_open() gets called with file->f_flags containing
O_RDWR. When the host is a Linux machine this fails because doing
a open(..., O_RDWR) on a file which exists and has mode 0444 results
in an -EPERM error.
Other network-filesystems and fuse avoid the problem of needing to
pass 2 open() calls to the other side by using the atomic_open
directory-inode op.
This commit fixes git clone not working inside a vboxsf mount,
by adding support for the atomic_open directory-inode op.
As an added bonus this should also make opening new files faster.
The atomic_open implementation is modelled after the atomic_open
implementations from the 9p and fuse code.
Fixes: 0fd169576648 ("fs: Add VirtualBox guest shared folder (vboxsf) support")
Reported-by: Ludovic Pouzenc <bugreports(a)pouzenc.fr>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
diff --git a/fs/vboxsf/dir.c b/fs/vboxsf/dir.c
index 87d5b2fef592..c4769a9396c5 100644
--- a/fs/vboxsf/dir.c
+++ b/fs/vboxsf/dir.c
@@ -308,6 +308,53 @@ static int vboxsf_dir_mkdir(struct user_namespace *mnt_userns,
return vboxsf_dir_create(parent, dentry, mode, true, true, NULL);
}
+static int vboxsf_dir_atomic_open(struct inode *parent, struct dentry *dentry,
+ struct file *file, unsigned int flags, umode_t mode)
+{
+ struct vboxsf_sbi *sbi = VBOXSF_SBI(parent->i_sb);
+ struct vboxsf_handle *sf_handle;
+ struct dentry *res = NULL;
+ u64 handle;
+ int err;
+
+ if (d_in_lookup(dentry)) {
+ res = vboxsf_dir_lookup(parent, dentry, 0);
+ if (IS_ERR(res))
+ return PTR_ERR(res);
+
+ if (res)
+ dentry = res;
+ }
+
+ /* Only creates */
+ if (!(flags & O_CREAT) || d_really_is_positive(dentry))
+ return finish_no_open(file, res);
+
+ err = vboxsf_dir_create(parent, dentry, mode, false, flags & O_EXCL, &handle);
+ if (err)
+ goto out;
+
+ sf_handle = vboxsf_create_sf_handle(d_inode(dentry), handle, SHFL_CF_ACCESS_READWRITE);
+ if (IS_ERR(sf_handle)) {
+ vboxsf_close(sbi->root, handle);
+ err = PTR_ERR(sf_handle);
+ goto out;
+ }
+
+ err = finish_open(file, dentry, generic_file_open);
+ if (err) {
+ /* This also closes the handle passed to vboxsf_create_sf_handle() */
+ vboxsf_release_sf_handle(d_inode(dentry), sf_handle);
+ goto out;
+ }
+
+ file->private_data = sf_handle;
+ file->f_mode |= FMODE_CREATED;
+out:
+ dput(res);
+ return err;
+}
+
static int vboxsf_dir_unlink(struct inode *parent, struct dentry *dentry)
{
struct vboxsf_sbi *sbi = VBOXSF_SBI(parent->i_sb);
@@ -428,6 +475,7 @@ const struct inode_operations vboxsf_dir_iops = {
.lookup = vboxsf_dir_lookup,
.create = vboxsf_dir_mkfile,
.mkdir = vboxsf_dir_mkdir,
+ .atomic_open = vboxsf_dir_atomic_open,
.rmdir = vboxsf_dir_unlink,
.unlink = vboxsf_dir_unlink,
.rename = vboxsf_dir_rename,
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 358ed624207012f03318235017ac6fb41f8af592 Mon Sep 17 00:00:00 2001
From: Talal Ahmad <talalahmad(a)google.com>
Date: Fri, 9 Jul 2021 11:43:06 -0400
Subject: [PATCH] tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy
path
sk_wmem_schedule makes sure that sk_forward_alloc has enough
bytes for charging that is going to be done by sk_mem_charge.
In the transmit zerocopy path, there is sk_mem_charge but there was
no call to sk_wmem_schedule. This change adds that call.
Without this call to sk_wmem_schedule, sk_forward_alloc can go
negetive which is a bug because sk_forward_alloc is a per-socket
space that has been forward charged so this can't be negative.
Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Talal Ahmad <talalahmad(a)google.com>
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Reviewed-by: Wei Wang <weiwan(a)google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index d5ab5f243640..8cb44040ec68 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1375,6 +1375,9 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
}
pfrag->offset += copy;
} else {
+ if (!sk_wmem_schedule(sk, copy))
+ goto wait_for_space;
+
err = skb_zerocopy_iter_stream(sk, skb, msg, copy, uarg);
if (err == -EMSGSIZE || err == -EEXIST) {
tcp_mark_push(tp, skb);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 358ed624207012f03318235017ac6fb41f8af592 Mon Sep 17 00:00:00 2001
From: Talal Ahmad <talalahmad(a)google.com>
Date: Fri, 9 Jul 2021 11:43:06 -0400
Subject: [PATCH] tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy
path
sk_wmem_schedule makes sure that sk_forward_alloc has enough
bytes for charging that is going to be done by sk_mem_charge.
In the transmit zerocopy path, there is sk_mem_charge but there was
no call to sk_wmem_schedule. This change adds that call.
Without this call to sk_wmem_schedule, sk_forward_alloc can go
negetive which is a bug because sk_forward_alloc is a per-socket
space that has been forward charged so this can't be negative.
Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Talal Ahmad <talalahmad(a)google.com>
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Reviewed-by: Wei Wang <weiwan(a)google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index d5ab5f243640..8cb44040ec68 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1375,6 +1375,9 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
}
pfrag->offset += copy;
} else {
+ if (!sk_wmem_schedule(sk, copy))
+ goto wait_for_space;
+
err = skb_zerocopy_iter_stream(sk, skb, msg, copy, uarg);
if (err == -EMSGSIZE || err == -EEXIST) {
tcp_mark_push(tp, skb);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 358ed624207012f03318235017ac6fb41f8af592 Mon Sep 17 00:00:00 2001
From: Talal Ahmad <talalahmad(a)google.com>
Date: Fri, 9 Jul 2021 11:43:06 -0400
Subject: [PATCH] tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy
path
sk_wmem_schedule makes sure that sk_forward_alloc has enough
bytes for charging that is going to be done by sk_mem_charge.
In the transmit zerocopy path, there is sk_mem_charge but there was
no call to sk_wmem_schedule. This change adds that call.
Without this call to sk_wmem_schedule, sk_forward_alloc can go
negetive which is a bug because sk_forward_alloc is a per-socket
space that has been forward charged so this can't be negative.
Fixes: f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Talal Ahmad <talalahmad(a)google.com>
Reviewed-by: Willem de Bruijn <willemb(a)google.com>
Reviewed-by: Wei Wang <weiwan(a)google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil(a)google.com>
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index d5ab5f243640..8cb44040ec68 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1375,6 +1375,9 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
}
pfrag->offset += copy;
} else {
+ if (!sk_wmem_schedule(sk, copy))
+ goto wait_for_space;
+
err = skb_zerocopy_iter_stream(sk, skb, msg, copy, uarg);
if (err == -EMSGSIZE || err == -EEXIST) {
tcp_mark_push(tp, skb);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From cc3ddee97cff034cea4d095de4a484c92a219bf5 Mon Sep 17 00:00:00 2001
From: Hans de Goede <hdegoede(a)redhat.com>
Date: Thu, 21 Jan 2021 10:08:59 +0100
Subject: [PATCH] vboxsf: Honor excl flag to the dir-inode create op
Honor the excl flag to the dir-inode create op, instead of behaving
as if it is always set.
Note the old behavior still worked most of the time since a non-exclusive
open only calls the create op, if there is a race and the file is created
between the dentry lookup and the calling of the create call.
While at it change the type of the is_dir parameter to the
vboxsf_dir_create() helper from an int to a bool, to be consistent with
the use of bool for the excl parameter.
Fixes: 0fd169576648 ("fs: Add VirtualBox guest shared folder (vboxsf) support")
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
diff --git a/fs/vboxsf/dir.c b/fs/vboxsf/dir.c
index eac6788fc6cf..6ff84343edca 100644
--- a/fs/vboxsf/dir.c
+++ b/fs/vboxsf/dir.c
@@ -253,7 +253,7 @@ static int vboxsf_dir_instantiate(struct inode *parent, struct dentry *dentry,
}
static int vboxsf_dir_create(struct inode *parent, struct dentry *dentry,
- umode_t mode, int is_dir)
+ umode_t mode, bool is_dir, bool excl)
{
struct vboxsf_inode *sf_parent_i = VBOXSF_I(parent);
struct vboxsf_sbi *sbi = VBOXSF_SBI(parent->i_sb);
@@ -261,10 +261,12 @@ static int vboxsf_dir_create(struct inode *parent, struct dentry *dentry,
int err;
params.handle = SHFL_HANDLE_NIL;
- params.create_flags = SHFL_CF_ACT_CREATE_IF_NEW |
- SHFL_CF_ACT_FAIL_IF_EXISTS |
- SHFL_CF_ACCESS_READWRITE |
- (is_dir ? SHFL_CF_DIRECTORY : 0);
+ params.create_flags = SHFL_CF_ACT_CREATE_IF_NEW | SHFL_CF_ACCESS_READWRITE;
+ if (is_dir)
+ params.create_flags |= SHFL_CF_DIRECTORY;
+ if (excl)
+ params.create_flags |= SHFL_CF_ACT_FAIL_IF_EXISTS;
+
params.info.attr.mode = (mode & 0777) |
(is_dir ? SHFL_TYPE_DIRECTORY : SHFL_TYPE_FILE);
params.info.attr.additional = SHFLFSOBJATTRADD_NOTHING;
@@ -292,14 +294,14 @@ static int vboxsf_dir_mkfile(struct user_namespace *mnt_userns,
struct inode *parent, struct dentry *dentry,
umode_t mode, bool excl)
{
- return vboxsf_dir_create(parent, dentry, mode, 0);
+ return vboxsf_dir_create(parent, dentry, mode, false, excl);
}
static int vboxsf_dir_mkdir(struct user_namespace *mnt_userns,
struct inode *parent, struct dentry *dentry,
umode_t mode)
{
- return vboxsf_dir_create(parent, dentry, mode, 1);
+ return vboxsf_dir_create(parent, dentry, mode, true, true);
}
static int vboxsf_dir_unlink(struct inode *parent, struct dentry *dentry)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 174a1dcc96429efce4ef7eb2f5c4506480da2182 Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <masahiroy(a)kernel.org>
Date: Mon, 17 May 2021 16:03:13 +0900
Subject: [PATCH] kbuild: sink stdout from cmd for silent build
When building with 'make -s', no output to stdout should be printed.
As Arnd Bergmann reported [1], mkimage shows the detailed information
of the generated images.
I think this should be suppressed by the 'cmd' macro instead of by
individual scripts.
Insert 'exec >/dev/null;' in order to redirect stdout to /dev/null for
silent builds.
[Note about this implementation]
'exec >/dev/null;' may look somewhat tricky, but this has a reason.
Appending '>/dev/null' at the end of command line is a common way for
redirection, so I first tried this:
cmd = @set -e; $(echo-cmd) $(cmd_$(1)) >/dev/null
... but it would not work if $(cmd_$(1)) itself contains a redirection.
For example, cmd_wrap in scripts/Makefile.asm-generic redirects the
output from the 'echo' command into the target file.
It would be expanded into:
echo "#include <asm-generic/$*.h>" > $@ >/dev/null
Then, the target file gets empty because the string will go to /dev/null
instead of $@.
Next, I tried this:
cmd = @set -e; $(echo-cmd) { $(cmd_$(1)); } >/dev/null
The form above would be expanded into:
{ echo "#include <asm-generic/$*.h>" > $@; } >/dev/null
This works as expected. However, it would be a syntax error if
$(cmd_$(1)) is empty.
When CONFIG_TRIM_UNUSED_KSYMS is disabled, $(call cmd,gen_ksymdeps) in
scripts/Makefile.build would be expanded into:
set -e; { ; } >/dev/null
..., which causes an syntax error.
I also tried this:
cmd = @set -e; $(echo-cmd) ( $(cmd_$(1)) ) >/dev/null
... but this causes a syntax error for the same reason.
So, finally I adopted:
cmd = @set -e; $(echo-cmd) exec >/dev/null; $(cmd_$(1))
[1]: https://lore.kernel.org/lkml/20210514135752.2910387-1-arnd@kernel.org/
Signed-off-by: Masahiro Yamada <masahiroy(a)kernel.org>
diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 82dd1b65b7a8..f247e691562d 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -90,8 +90,13 @@ clean := -f $(srctree)/scripts/Makefile.clean obj
echo-cmd = $(if $($(quiet)cmd_$(1)),\
echo ' $(call escsq,$($(quiet)cmd_$(1)))$(echo-why)';)
+# sink stdout for 'make -s'
+ redirect :=
+ quiet_redirect :=
+silent_redirect := exec >/dev/null;
+
# printing commands
-cmd = @set -e; $(echo-cmd) $(cmd_$(1))
+cmd = @set -e; $(echo-cmd) $($(quiet)redirect) $(cmd_$(1))
###
# if_changed - execute command if any prerequisite is newer than
The patch below does not apply to the 5.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b0b33b048dcfbd7da82c3cde4fab02751dfab4d6 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Date: Tue, 13 Jul 2021 12:37:19 +0300
Subject: [PATCH] net: dsa: sja1105: fix address learning getting disabled on
the CPU port
In May 2019 when commit 640f763f98c2 ("net: dsa: sja1105: Add support
for Spanning Tree Protocol") was introduced, the comment that "STP does
not get called for the CPU port" was true. This changed after commit
0394a63acfe2 ("net: dsa: enable and disable all ports") in August 2019
and went largely unnoticed, because the sja1105_bridge_stp_state_set()
method did nothing different compared to the static setup done by
sja1105_init_mac_settings().
With the ability to turn address learning off introduced by the blamed
commit, there is a new priv->learn_ena port mask in the driver. When
sja1105_bridge_stp_state_set() gets called and we are in
BR_STATE_LEARNING or later, address learning is enabled or not depending
on priv->learn_ena & BIT(port).
So what happens is that priv->learn_ena is not being set from anywhere
for the CPU port, and the static configuration done by
sja1105_init_mac_settings() is being overwritten.
To solve this, acknowledge that the static configuration of STP state is
no longer necessary because the STP state is being set by the DSA core
now, but what is necessary is to set priv->learn_ena for the CPU port.
Fixes: 4d9423549501 ("net: dsa: sja1105: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 4f0545605f6b..ced8c9cb29c2 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -122,14 +122,12 @@ static int sja1105_init_mac_settings(struct sja1105_private *priv)
for (i = 0; i < ds->num_ports; i++) {
mac[i] = default_mac;
- if (i == dsa_upstream_port(priv->ds, i)) {
- /* STP doesn't get called for CPU port, so we need to
- * set the I/O parameters statically.
- */
- mac[i].dyn_learn = true;
- mac[i].ingress = true;
- mac[i].egress = true;
- }
+
+ /* Let sja1105_bridge_stp_state_set() keep address learning
+ * enabled for the CPU port.
+ */
+ if (dsa_is_cpu_port(ds, i))
+ priv->learn_ena |= BIT(i);
}
return 0;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a019abd8022061b917da767cd1a66ed823724eab Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller(a)proxmox.com>
Date: Fri, 2 Jul 2021 14:07:36 +0200
Subject: [PATCH] net: bridge: sync fdb to new unicast-filtering ports
Since commit 2796d0c648c9 ("bridge: Automatically manage
port promiscuous mode.")
bridges with `vlan_filtering 1` and only 1 auto-port don't
set IFF_PROMISC for unicast-filtering-capable ports.
Normally on port changes `br_manage_promisc` is called to
update the promisc flags and unicast filters if necessary,
but it cannot distinguish between *new* ports and ones
losing their promisc flag, and new ports end up not
receiving the MAC address list.
Fix this by calling `br_fdb_sync_static` in `br_add_if`
after the port promisc flags are updated and the unicast
filter was supposed to have been filled.
Fixes: 2796d0c648c9 ("bridge: Automatically manage port promiscuous mode.")
Signed-off-by: Wolfgang Bumiller <w.bumiller(a)proxmox.com>
Acked-by: Nikolay Aleksandrov <nikolay(a)nvidia.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index f7d2f472ae24..6e4a32354a13 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -562,7 +562,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
struct net_bridge_port *p;
int err = 0;
unsigned br_hr, dev_hr;
- bool changed_addr;
+ bool changed_addr, fdb_synced = false;
/* Don't allow bridging non-ethernet like devices. */
if ((dev->flags & IFF_LOOPBACK) ||
@@ -652,6 +652,19 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
list_add_rcu(&p->list, &br->port_list);
nbp_update_port_count(br);
+ if (!br_promisc_port(p) && (p->dev->priv_flags & IFF_UNICAST_FLT)) {
+ /* When updating the port count we also update all ports'
+ * promiscuous mode.
+ * A port leaving promiscuous mode normally gets the bridge's
+ * fdb synced to the unicast filter (if supported), however,
+ * `br_port_clear_promisc` does not distinguish between
+ * non-promiscuous ports and *new* ports, so we need to
+ * sync explicitly here.
+ */
+ fdb_synced = br_fdb_sync_static(br, p) == 0;
+ if (!fdb_synced)
+ netdev_err(dev, "failed to sync bridge static fdb addresses to this port\n");
+ }
netdev_update_features(br->dev);
@@ -701,6 +714,8 @@ int br_add_if(struct net_bridge *br, struct net_device *dev,
return 0;
err7:
+ if (fdb_synced)
+ br_fdb_unsync_static(br, p);
list_del_rcu(&p->list);
br_fdb_delete_by_port(br, p, 0, 1);
nbp_update_port_count(br);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 953b0dcbe2e3f7bee98cc3bca2ec82c8298e9c16 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Beh=C3=BAn?= <kabel(a)kernel.org>
Date: Thu, 1 Jul 2021 00:22:31 +0200
Subject: [PATCH] net: dsa: mv88e6xxx: enable SerDes PCS register dump via
ethtool -d on Topaz
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit bf3504cea7d7e ("net: dsa: mv88e6xxx: Add 6390 family PCS
registers to ethtool -d") added support for dumping SerDes PCS registers
via ethtool -d for Peridot.
The same implementation is also valid for Topaz, but was not
enabled at the time.
Signed-off-by: Marek Behún <kabel(a)kernel.org>
Fixes: bf3504cea7d7e ("net: dsa: mv88e6xxx: Add 6390 family PCS registers to ethtool -d")
Reviewed-by: Andrew Lunn <andrew(a)lunn.ch>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 1e95a0facbd4..beb41572d04e 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3626,6 +3626,8 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
.serdes_get_sset_count = mv88e6390_serdes_get_sset_count,
.serdes_get_strings = mv88e6390_serdes_get_strings,
.serdes_get_stats = mv88e6390_serdes_get_stats,
+ .serdes_get_regs_len = mv88e6390_serdes_get_regs_len,
+ .serdes_get_regs = mv88e6390_serdes_get_regs,
.phylink_validate = mv88e6341_phylink_validate,
};
@@ -4435,6 +4437,8 @@ static const struct mv88e6xxx_ops mv88e6341_ops = {
.serdes_get_sset_count = mv88e6390_serdes_get_sset_count,
.serdes_get_strings = mv88e6390_serdes_get_strings,
.serdes_get_stats = mv88e6390_serdes_get_stats,
+ .serdes_get_regs_len = mv88e6390_serdes_get_regs_len,
+ .serdes_get_regs = mv88e6390_serdes_get_regs,
.phylink_validate = mv88e6341_phylink_validate,
};
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a03b98d68367b18e5db6d6850e2cc18754fba94a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marek=20Beh=C3=BAn?= <kabel(a)kernel.org>
Date: Thu, 1 Jul 2021 00:22:30 +0200
Subject: [PATCH] net: dsa: mv88e6xxx: enable SerDes RX stats for Topaz
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit 0df952873636a ("mv88e6xxx: Add serdes Rx statistics") added
support for RX statistics on SerDes ports for Peridot.
This same implementation is also valid for Topaz, but was not enabled
at the time.
We need to use the generic .serdes_get_lane() method instead of the
Peridot specific one in the stats methods so that on Topaz the proper
one is used.
Signed-off-by: Marek Behún <kabel(a)kernel.org>
Fixes: 0df952873636a ("mv88e6xxx: Add serdes Rx statistics")
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 354ff0b84b7f..1e95a0facbd4 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3623,6 +3623,9 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
.serdes_irq_enable = mv88e6390_serdes_irq_enable,
.serdes_irq_status = mv88e6390_serdes_irq_status,
.gpio_ops = &mv88e6352_gpio_ops,
+ .serdes_get_sset_count = mv88e6390_serdes_get_sset_count,
+ .serdes_get_strings = mv88e6390_serdes_get_strings,
+ .serdes_get_stats = mv88e6390_serdes_get_stats,
.phylink_validate = mv88e6341_phylink_validate,
};
@@ -4429,6 +4432,9 @@ static const struct mv88e6xxx_ops mv88e6341_ops = {
.gpio_ops = &mv88e6352_gpio_ops,
.avb_ops = &mv88e6390_avb_ops,
.ptp_ops = &mv88e6352_ptp_ops,
+ .serdes_get_sset_count = mv88e6390_serdes_get_sset_count,
+ .serdes_get_strings = mv88e6390_serdes_get_strings,
+ .serdes_get_stats = mv88e6390_serdes_get_stats,
.phylink_validate = mv88e6341_phylink_validate,
};
diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index e4fbef81bc52..b1d46dd8eaab 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -722,7 +722,7 @@ static struct mv88e6390_serdes_hw_stat mv88e6390_serdes_hw_stats[] = {
int mv88e6390_serdes_get_sset_count(struct mv88e6xxx_chip *chip, int port)
{
- if (mv88e6390_serdes_get_lane(chip, port) < 0)
+ if (mv88e6xxx_serdes_get_lane(chip, port) < 0)
return 0;
return ARRAY_SIZE(mv88e6390_serdes_hw_stats);
@@ -734,7 +734,7 @@ int mv88e6390_serdes_get_strings(struct mv88e6xxx_chip *chip,
struct mv88e6390_serdes_hw_stat *stat;
int i;
- if (mv88e6390_serdes_get_lane(chip, port) < 0)
+ if (mv88e6xxx_serdes_get_lane(chip, port) < 0)
return 0;
for (i = 0; i < ARRAY_SIZE(mv88e6390_serdes_hw_stats); i++) {
@@ -770,7 +770,7 @@ int mv88e6390_serdes_get_stats(struct mv88e6xxx_chip *chip, int port,
int lane;
int i;
- lane = mv88e6390_serdes_get_lane(chip, port);
+ lane = mv88e6xxx_serdes_get_lane(chip, port);
if (lane < 0)
return 0;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a3c680aa2c12c90c44af383fe6882a39875ab81 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Tue, 29 Jun 2021 17:14:19 -0700
Subject: [PATCH] net: bcmgenet: ensure EXT_ENERGY_DET_MASK is clear
Setting the EXT_ENERGY_DET_MASK bit allows the port energy detection
logic of the internal PHY to prevent the system from sleeping. Some
internal PHYs will report that energy is detected when the network
interface is closed which can prevent the system from going to sleep
if WoL is enabled when the interface is brought down.
Since the driver does not support waking the system on this logic,
this commit clears the bit whenever the internal PHY is powered up
and the other logic for manipulating the bit is removed since it
serves no useful function.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 41f7f078cd27..35e9956e930c 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1640,7 +1640,8 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
switch (mode) {
case GENET_POWER_PASSIVE:
- reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS);
+ reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS |
+ EXT_ENERGY_DET_MASK);
if (GENET_IS_V5(priv)) {
reg &= ~(EXT_PWR_DOWN_PHY_EN |
EXT_PWR_DOWN_PHY_RD |
@@ -3292,7 +3293,6 @@ static int bcmgenet_open(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
unsigned long dma_ctrl;
- u32 reg;
int ret;
netif_dbg(priv, ifup, dev, "bcmgenet_open\n");
@@ -3318,12 +3318,6 @@ static int bcmgenet_open(struct net_device *dev)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
@@ -4139,7 +4133,6 @@ static int bcmgenet_resume(struct device *d)
struct bcmgenet_priv *priv = netdev_priv(dev);
struct bcmgenet_rxnfc_rule *rule;
unsigned long dma_ctrl;
- u32 reg;
int ret;
if (!netif_running(dev))
@@ -4176,12 +4169,6 @@ static int bcmgenet_resume(struct device *d)
if (rule->state != BCMGENET_RXNFC_STATE_UNUSED)
bcmgenet_hfb_create_rxnfc_filter(priv, rule);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
index facde824bcaa..e31a5a397f11 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
@@ -186,12 +186,6 @@ int bcmgenet_wol_power_down_cfg(struct bcmgenet_priv *priv,
reg |= CMD_RX_EN;
bcmgenet_umac_writel(priv, reg, UMAC_CMD);
- if (priv->hw_params->flags & GENET_HAS_EXT) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg &= ~EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
reg = UMAC_IRQ_MPD_R;
if (hfb_enable)
reg |= UMAC_IRQ_HFB_SM | UMAC_IRQ_HFB_MM;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a3c680aa2c12c90c44af383fe6882a39875ab81 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Tue, 29 Jun 2021 17:14:19 -0700
Subject: [PATCH] net: bcmgenet: ensure EXT_ENERGY_DET_MASK is clear
Setting the EXT_ENERGY_DET_MASK bit allows the port energy detection
logic of the internal PHY to prevent the system from sleeping. Some
internal PHYs will report that energy is detected when the network
interface is closed which can prevent the system from going to sleep
if WoL is enabled when the interface is brought down.
Since the driver does not support waking the system on this logic,
this commit clears the bit whenever the internal PHY is powered up
and the other logic for manipulating the bit is removed since it
serves no useful function.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 41f7f078cd27..35e9956e930c 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1640,7 +1640,8 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
switch (mode) {
case GENET_POWER_PASSIVE:
- reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS);
+ reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS |
+ EXT_ENERGY_DET_MASK);
if (GENET_IS_V5(priv)) {
reg &= ~(EXT_PWR_DOWN_PHY_EN |
EXT_PWR_DOWN_PHY_RD |
@@ -3292,7 +3293,6 @@ static int bcmgenet_open(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
unsigned long dma_ctrl;
- u32 reg;
int ret;
netif_dbg(priv, ifup, dev, "bcmgenet_open\n");
@@ -3318,12 +3318,6 @@ static int bcmgenet_open(struct net_device *dev)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
@@ -4139,7 +4133,6 @@ static int bcmgenet_resume(struct device *d)
struct bcmgenet_priv *priv = netdev_priv(dev);
struct bcmgenet_rxnfc_rule *rule;
unsigned long dma_ctrl;
- u32 reg;
int ret;
if (!netif_running(dev))
@@ -4176,12 +4169,6 @@ static int bcmgenet_resume(struct device *d)
if (rule->state != BCMGENET_RXNFC_STATE_UNUSED)
bcmgenet_hfb_create_rxnfc_filter(priv, rule);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
index facde824bcaa..e31a5a397f11 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
@@ -186,12 +186,6 @@ int bcmgenet_wol_power_down_cfg(struct bcmgenet_priv *priv,
reg |= CMD_RX_EN;
bcmgenet_umac_writel(priv, reg, UMAC_CMD);
- if (priv->hw_params->flags & GENET_HAS_EXT) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg &= ~EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
reg = UMAC_IRQ_MPD_R;
if (hfb_enable)
reg |= UMAC_IRQ_HFB_SM | UMAC_IRQ_HFB_MM;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a3c680aa2c12c90c44af383fe6882a39875ab81 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Tue, 29 Jun 2021 17:14:19 -0700
Subject: [PATCH] net: bcmgenet: ensure EXT_ENERGY_DET_MASK is clear
Setting the EXT_ENERGY_DET_MASK bit allows the port energy detection
logic of the internal PHY to prevent the system from sleeping. Some
internal PHYs will report that energy is detected when the network
interface is closed which can prevent the system from going to sleep
if WoL is enabled when the interface is brought down.
Since the driver does not support waking the system on this logic,
this commit clears the bit whenever the internal PHY is powered up
and the other logic for manipulating the bit is removed since it
serves no useful function.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 41f7f078cd27..35e9956e930c 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1640,7 +1640,8 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
switch (mode) {
case GENET_POWER_PASSIVE:
- reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS);
+ reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS |
+ EXT_ENERGY_DET_MASK);
if (GENET_IS_V5(priv)) {
reg &= ~(EXT_PWR_DOWN_PHY_EN |
EXT_PWR_DOWN_PHY_RD |
@@ -3292,7 +3293,6 @@ static int bcmgenet_open(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
unsigned long dma_ctrl;
- u32 reg;
int ret;
netif_dbg(priv, ifup, dev, "bcmgenet_open\n");
@@ -3318,12 +3318,6 @@ static int bcmgenet_open(struct net_device *dev)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
@@ -4139,7 +4133,6 @@ static int bcmgenet_resume(struct device *d)
struct bcmgenet_priv *priv = netdev_priv(dev);
struct bcmgenet_rxnfc_rule *rule;
unsigned long dma_ctrl;
- u32 reg;
int ret;
if (!netif_running(dev))
@@ -4176,12 +4169,6 @@ static int bcmgenet_resume(struct device *d)
if (rule->state != BCMGENET_RXNFC_STATE_UNUSED)
bcmgenet_hfb_create_rxnfc_filter(priv, rule);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
index facde824bcaa..e31a5a397f11 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
@@ -186,12 +186,6 @@ int bcmgenet_wol_power_down_cfg(struct bcmgenet_priv *priv,
reg |= CMD_RX_EN;
bcmgenet_umac_writel(priv, reg, UMAC_CMD);
- if (priv->hw_params->flags & GENET_HAS_EXT) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg &= ~EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
reg = UMAC_IRQ_MPD_R;
if (hfb_enable)
reg |= UMAC_IRQ_HFB_SM | UMAC_IRQ_HFB_MM;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a3c680aa2c12c90c44af383fe6882a39875ab81 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Tue, 29 Jun 2021 17:14:19 -0700
Subject: [PATCH] net: bcmgenet: ensure EXT_ENERGY_DET_MASK is clear
Setting the EXT_ENERGY_DET_MASK bit allows the port energy detection
logic of the internal PHY to prevent the system from sleeping. Some
internal PHYs will report that energy is detected when the network
interface is closed which can prevent the system from going to sleep
if WoL is enabled when the interface is brought down.
Since the driver does not support waking the system on this logic,
this commit clears the bit whenever the internal PHY is powered up
and the other logic for manipulating the bit is removed since it
serves no useful function.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 41f7f078cd27..35e9956e930c 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1640,7 +1640,8 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
switch (mode) {
case GENET_POWER_PASSIVE:
- reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS);
+ reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS |
+ EXT_ENERGY_DET_MASK);
if (GENET_IS_V5(priv)) {
reg &= ~(EXT_PWR_DOWN_PHY_EN |
EXT_PWR_DOWN_PHY_RD |
@@ -3292,7 +3293,6 @@ static int bcmgenet_open(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
unsigned long dma_ctrl;
- u32 reg;
int ret;
netif_dbg(priv, ifup, dev, "bcmgenet_open\n");
@@ -3318,12 +3318,6 @@ static int bcmgenet_open(struct net_device *dev)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
@@ -4139,7 +4133,6 @@ static int bcmgenet_resume(struct device *d)
struct bcmgenet_priv *priv = netdev_priv(dev);
struct bcmgenet_rxnfc_rule *rule;
unsigned long dma_ctrl;
- u32 reg;
int ret;
if (!netif_running(dev))
@@ -4176,12 +4169,6 @@ static int bcmgenet_resume(struct device *d)
if (rule->state != BCMGENET_RXNFC_STATE_UNUSED)
bcmgenet_hfb_create_rxnfc_filter(priv, rule);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
index facde824bcaa..e31a5a397f11 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
@@ -186,12 +186,6 @@ int bcmgenet_wol_power_down_cfg(struct bcmgenet_priv *priv,
reg |= CMD_RX_EN;
bcmgenet_umac_writel(priv, reg, UMAC_CMD);
- if (priv->hw_params->flags & GENET_HAS_EXT) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg &= ~EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
reg = UMAC_IRQ_MPD_R;
if (hfb_enable)
reg |= UMAC_IRQ_HFB_SM | UMAC_IRQ_HFB_MM;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5a3c680aa2c12c90c44af383fe6882a39875ab81 Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Tue, 29 Jun 2021 17:14:19 -0700
Subject: [PATCH] net: bcmgenet: ensure EXT_ENERGY_DET_MASK is clear
Setting the EXT_ENERGY_DET_MASK bit allows the port energy detection
logic of the internal PHY to prevent the system from sleeping. Some
internal PHYs will report that energy is detected when the network
interface is closed which can prevent the system from going to sleep
if WoL is enabled when the interface is brought down.
Since the driver does not support waking the system on this logic,
this commit clears the bit whenever the internal PHY is powered up
and the other logic for manipulating the bit is removed since it
serves no useful function.
Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Florian Fainelli <f.fainelli(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 41f7f078cd27..35e9956e930c 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1640,7 +1640,8 @@ static void bcmgenet_power_up(struct bcmgenet_priv *priv,
switch (mode) {
case GENET_POWER_PASSIVE:
- reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS);
+ reg &= ~(EXT_PWR_DOWN_DLL | EXT_PWR_DOWN_BIAS |
+ EXT_ENERGY_DET_MASK);
if (GENET_IS_V5(priv)) {
reg &= ~(EXT_PWR_DOWN_PHY_EN |
EXT_PWR_DOWN_PHY_RD |
@@ -3292,7 +3293,6 @@ static int bcmgenet_open(struct net_device *dev)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
unsigned long dma_ctrl;
- u32 reg;
int ret;
netif_dbg(priv, ifup, dev, "bcmgenet_open\n");
@@ -3318,12 +3318,6 @@ static int bcmgenet_open(struct net_device *dev)
bcmgenet_set_hw_addr(priv, dev->dev_addr);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
@@ -4139,7 +4133,6 @@ static int bcmgenet_resume(struct device *d)
struct bcmgenet_priv *priv = netdev_priv(dev);
struct bcmgenet_rxnfc_rule *rule;
unsigned long dma_ctrl;
- u32 reg;
int ret;
if (!netif_running(dev))
@@ -4176,12 +4169,6 @@ static int bcmgenet_resume(struct device *d)
if (rule->state != BCMGENET_RXNFC_STATE_UNUSED)
bcmgenet_hfb_create_rxnfc_filter(priv, rule);
- if (priv->internal_phy) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg |= EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
/* Disable RX/TX DMA and flush TX queues */
dma_ctrl = bcmgenet_dma_disable(priv);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
index facde824bcaa..e31a5a397f11 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c
@@ -186,12 +186,6 @@ int bcmgenet_wol_power_down_cfg(struct bcmgenet_priv *priv,
reg |= CMD_RX_EN;
bcmgenet_umac_writel(priv, reg, UMAC_CMD);
- if (priv->hw_params->flags & GENET_HAS_EXT) {
- reg = bcmgenet_ext_readl(priv, EXT_EXT_PWR_MGMT);
- reg &= ~EXT_ENERGY_DET_MASK;
- bcmgenet_ext_writel(priv, reg, EXT_EXT_PWR_MGMT);
- }
-
reg = UMAC_IRQ_MPD_R;
if (hfb_enable)
reg |= UMAC_IRQ_HFB_SM | UMAC_IRQ_HFB_MM;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f9dfb5e390fab2df9f7944bb91e7705aba14cd26 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Fri, 18 Jun 2021 16:18:25 +0200
Subject: [PATCH] x86/fpu: Make init_fpstate correct with optimized XSAVE
The XSAVE init code initializes all enabled and supported components with
XRSTOR(S) to init state. Then it XSAVEs the state of the components back
into init_fpstate which is used in several places to fill in the init state
of components.
This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because
those use the init optimization and skip writing state of components which
are in init state. So init_fpstate.xsave still contains all zeroes after
this operation.
There are two ways to solve that:
1) Use XSAVE unconditionally, but that requires to reshuffle the buffer when
XSAVES is enabled because XSAVES uses compacted format.
2) Save the components which are known to have a non-zero init state by other
means.
Looking deeper, #2 is the right thing to do because all components the
kernel supports have all-zeroes init state except the legacy features (FP,
SSE). Those cannot be hard coded because the states are not identical on all
CPUs, but they can be saved with FXSAVE which avoids all conditionals.
Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with
a BUILD_BUG_ON() which reminds developers to validate that a newly added
component has all zeroes init state. As a bonus remove the now unused
copy_xregs_to_kernel_booting() crutch.
The XSAVE and reshuffle method can still be implemented in the unlikely
case that components are added which have a non-zero init state and no
other means to save them. For now, FXSAVE is just simple and good enough.
[ bp: Fix a typo or two in the text. ]
Fixes: 6bad06b76892 ("x86, xsave: Use xsaveopt in context-switch path when supported")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Reviewed-by: Borislav Petkov <bp(a)suse.de>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20210618143444.587311343@linutronix.de
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index fdee23ea4e17..16bf4d4a8159 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -204,6 +204,14 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
asm volatile("fxsaveq %[fx]" : [fx] "=m" (fpu->state.fxsave));
}
+static inline void fxsave(struct fxregs_state *fx)
+{
+ if (IS_ENABLED(CONFIG_X86_32))
+ asm volatile( "fxsave %[fx]" : [fx] "=m" (*fx));
+ else
+ asm volatile("fxsaveq %[fx]" : [fx] "=m" (*fx));
+}
+
/* These macros all use (%edi)/(%rdi) as the single memory argument. */
#define XSAVE ".byte " REX_PREFIX "0x0f,0xae,0x27"
#define XSAVEOPT ".byte " REX_PREFIX "0x0f,0xae,0x37"
@@ -268,28 +276,6 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
: "D" (st), "m" (*st), "a" (lmask), "d" (hmask) \
: "memory")
-/*
- * This function is called only during boot time when x86 caps are not set
- * up and alternative can not be used yet.
- */
-static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
-{
- u64 mask = xfeatures_mask_all;
- u32 lmask = mask;
- u32 hmask = mask >> 32;
- int err;
-
- WARN_ON(system_state != SYSTEM_BOOTING);
-
- if (boot_cpu_has(X86_FEATURE_XSAVES))
- XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
- else
- XSTATE_OP(XSAVE, xstate, lmask, hmask, err);
-
- /* We should never fault when copying to a kernel buffer: */
- WARN_ON_FPU(err);
-}
-
/*
* This function is called only during boot time when x86 caps are not set
* up and alternative can not be used yet.
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index d0eef963aad1..1cadb2faf740 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -440,6 +440,25 @@ static void __init print_xstate_offset_size(void)
}
}
+/*
+ * All supported features have either init state all zeros or are
+ * handled in setup_init_fpu() individually. This is an explicit
+ * feature list and does not use XFEATURE_MASK*SUPPORTED to catch
+ * newly added supported features at build time and make people
+ * actually look at the init state for the new feature.
+ */
+#define XFEATURES_INIT_FPSTATE_HANDLED \
+ (XFEATURE_MASK_FP | \
+ XFEATURE_MASK_SSE | \
+ XFEATURE_MASK_YMM | \
+ XFEATURE_MASK_OPMASK | \
+ XFEATURE_MASK_ZMM_Hi256 | \
+ XFEATURE_MASK_Hi16_ZMM | \
+ XFEATURE_MASK_PKRU | \
+ XFEATURE_MASK_BNDREGS | \
+ XFEATURE_MASK_BNDCSR | \
+ XFEATURE_MASK_PASID)
+
/*
* setup the xstate image representing the init state
*/
@@ -447,6 +466,10 @@ static void __init setup_init_fpu_buf(void)
{
static int on_boot_cpu __initdata = 1;
+ BUILD_BUG_ON((XFEATURE_MASK_USER_SUPPORTED |
+ XFEATURE_MASK_SUPERVISOR_SUPPORTED) !=
+ XFEATURES_INIT_FPSTATE_HANDLED);
+
WARN_ON_FPU(!on_boot_cpu);
on_boot_cpu = 0;
@@ -466,10 +489,22 @@ static void __init setup_init_fpu_buf(void)
copy_kernel_to_xregs_booting(&init_fpstate.xsave);
/*
- * Dump the init state again. This is to identify the init state
- * of any feature which is not represented by all zero's.
+ * All components are now in init state. Read the state back so
+ * that init_fpstate contains all non-zero init state. This only
+ * works with XSAVE, but not with XSAVEOPT and XSAVES because
+ * those use the init optimization which skips writing data for
+ * components in init state.
+ *
+ * XSAVE could be used, but that would require to reshuffle the
+ * data when XSAVES is available because XSAVES uses xstate
+ * compaction. But doing so is a pointless exercise because most
+ * components have an all zeros init state except for the legacy
+ * ones (FP and SSE). Those can be saved with FXSAVE into the
+ * legacy area. Adding new features requires to ensure that init
+ * state is all zeroes or if not to add the necessary handling
+ * here.
*/
- copy_xregs_to_kernel_booting(&init_fpstate.xsave);
+ fxsave(&init_fpstate.fxsave);
}
static int xfeature_uncompacted_offset(int xfeature_nr)
The commit d38a2b7a9c93 ("mm: memcg/slab: fix memory leak at non-root
kmem_cache destroy") introduced a problem: If one thread destroy a
kmem_cache A and another thread concurrently create a kmem_cache B,
which is mergeable with A and has same size with A, the B may fail to
create due to the duplicate sysfs node.
The scenario in detail:
1) Thread 1 uses kmem_cache_destroy() to destroy kmem_cache A which is
mergeable, it decreases A's refcount and if refcount is 0, then call
memcg_set_kmem_cache_dying() which set A->memcg_params.dying = true,
then unlock the slab_mutex and call flush_memcg_workqueue(), it may cost
a while.
Note: now the sysfs node(like '/kernel/slab/:0000248') of A is still
present, it will be deleted in shutdown_cache() which will be called
after flush_memcg_workqueue() is done and lock the slab_mutex again.
2) Now if thread 2 is coming, it use kmem_cache_create() to create B, which
is mergeable with A(their size is same), it gain the lock of slab_mutex,
then call __kmem_cache_alias() trying to find a mergeable node, because
of the below added code in commit d38a2b7a9c93 ("mm: memcg/slab: fix
memory leak at non-root kmem_cache destroy"), B is not mergeable with
A whose memcg_params.dying is true.
int slab_unmergeable(struct kmem_cache *s)
if (s->refcount < 0)
return 1;
/*
* Skip the dying kmem_cache.
*/
if (s->memcg_params.dying)
return 1;
return 0;
}
So B has to create its own sysfs node by calling:
create_cache->
__kmem_cache_create->
sysfs_slab_add->
kobject_init_and_add
Because B is mergeable itself, its filename of sysfs node is based on its size,
like '/kernel/slab/:0000248', which is duplicate with A, and the sysfs
node of A is still present now, so kobject_init_and_add() will return
fail and result in kmem_cache_create() fail.
Concurrently modprobe and rmmod the two modules below can reproduce the issue
quickly: nf_conntrack_expect, se_sess_cache. See call trace in the end.
LTS versions of v4.19.y and v5.4.y have this problem, whereas linux versions after
v5.9 do not have this problem because the patchset: ("The new cgroup slab memory
controller") almost refactored memcg slab.
A potential solution(this patch belongs): Just let the dying kmem_cache be mergeable,
the slab_mutex lock can prevent the race between alias kmem_cache creating thread
and root kmem_cache destroying thread. In the destroying thread, after
flush_memcg_workqueue() is done, judge the refcount again, if someone
reference it again during un-lock time, we don't need to destroy the kmem_cache
completely, we can reuse it.
Another potential solution: revert the commit d38a2b7a9c93 ("mm: memcg/slab:
fix memory leak at non-root kmem_cache destroy"), compare to the fail of
kmem_cache_create, the memory leak in special scenario seems less harmful.
Call trace:
sysfs: cannot create duplicate filename '/kernel/slab/:0000248'
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
Call trace:
dump_backtrace+0x0/0x198
show_stack+0x24/0x30
dump_stack+0xb0/0x100
sysfs_warn_dup+0x6c/0x88
sysfs_create_dir_ns+0x104/0x120
kobject_add_internal+0xd0/0x378
kobject_init_and_add+0x90/0xd8
sysfs_slab_add+0x16c/0x2d0
__kmem_cache_create+0x16c/0x1d8
create_cache+0xbc/0x1f8
kmem_cache_create_usercopy+0x1a0/0x230
kmem_cache_create+0x50/0x68
init_se_kmem_caches+0x38/0x258 [target_core_mod]
target_core_init_configfs+0x8c/0x390 [target_core_mod]
do_one_initcall+0x54/0x230
do_init_module+0x64/0x1ec
load_module+0x150c/0x16f0
__se_sys_finit_module+0xf0/0x108
__arm64_sys_finit_module+0x24/0x30
el0_svc_common+0x80/0x1c0
el0_svc_handler+0x78/0xe0
el0_svc+0x10/0x260
kobject_add_internal failed for :0000248 with -EEXIST, don't try to register things with the same name in the same directory.
kmem_cache_create(se_sess_cache) failed with error -17
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
Call trace:
dump_backtrace+0x0/0x198
show_stack+0x24/0x30
dump_stack+0xb0/0x100
kmem_cache_create_usercopy+0xa8/0x230
kmem_cache_create+0x50/0x68
init_se_kmem_caches+0x38/0x258 [target_core_mod]
target_core_init_configfs+0x8c/0x390 [target_core_mod]
do_one_initcall+0x54/0x230
do_init_module+0x64/0x1ec
load_module+0x150c/0x16f0
__se_sys_finit_module+0xf0/0x108
__arm64_sys_finit_module+0x24/0x30
el0_svc_common+0x80/0x1c0
el0_svc_handler+0x78/0xe0
el0_svc+0x10/0x260
Fixes: d38a2b7a9c93 ("mm: memcg/slab: fix memory leak at non-root kmem_cache destroy")
Signed-off-by: Nanyong Sun <sunnanyong(a)huawei.com>
Cc: stable(a)vger.kernel.org
---
mm/slab_common.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d208b47e01a8..acc743315bb5 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -326,14 +326,6 @@ int slab_unmergeable(struct kmem_cache *s)
if (s->refcount < 0)
return 1;
-#ifdef CONFIG_MEMCG_KMEM
- /*
- * Skip the dying kmem_cache.
- */
- if (s->memcg_params.dying)
- return 1;
-#endif
-
return 0;
}
@@ -947,6 +939,16 @@ void kmem_cache_destroy(struct kmem_cache *s)
get_online_mems();
mutex_lock(&slab_mutex);
+
+ /*
+ *Another thread referenced it again
+ */
+ if (READ_ONCE(s->refcount)) {
+ spin_lock_irq(&memcg_kmem_wq_lock);
+ s->memcg_params.dying = false;
+ spin_unlock_irq(&memcg_kmem_wq_lock);
+ goto out_unlock;
+ }
#endif
err = shutdown_memcg_caches(s);
--
2.18.0.huawei.25
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 122e093c1734361dedb64f65c99b93e28e4624f4 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt(a)kernel.org>
Date: Mon, 28 Jun 2021 19:33:26 -0700
Subject: [PATCH] mm/page_alloc: fix memory map initialization for descending
nodes
On systems with memory nodes sorted in descending order, for instance Dell
Precision WorkStation T5500, the struct pages for higher PFNs and
respectively lower nodes, could be overwritten by the initialization of
struct pages corresponding to the holes in the memory sections.
For example for the below memory layout
[ 0.245624] Early memory node ranges
[ 0.248496] node 1: [mem 0x0000000000001000-0x0000000000090fff]
[ 0.251376] node 1: [mem 0x0000000000100000-0x00000000dbdf8fff]
[ 0.254256] node 1: [mem 0x0000000100000000-0x0000001423ffffff]
[ 0.257144] node 0: [mem 0x0000001424000000-0x0000002023ffffff]
the range 0x1424000000 - 0x1428000000 in the beginning of node 0 starts in
the middle of a section and will be considered as a hole during the
initialization of the last section in node 1.
The wrong initialization of the memory map causes panic on boot when
CONFIG_DEBUG_VM is enabled.
Reorder loop order of the memory map initialization so that the outer loop
will always iterate over populated memory regions in the ascending order
and the inner loop will select the zone corresponding to the PFN range.
This way initialization of the struct pages for the memory holes will be
always done for the ranges that are actually not populated.
[akpm(a)linux-foundation.org: coding style fixes]
Link: https://lkml.kernel.org/r/YNXlMqBbL+tBG7yq@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213073
Link: https://lkml.kernel.org/r/20210624062305.10940-1-rppt@kernel.org
Fixes: 0740a50b9baa ("mm/page_alloc.c: refactor initialization of struct page for holes in memory layout")
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Boris Petkov <bp(a)alien8.de>
Cc: Robert Shteynfeld <robert.shteynfeld(a)gmail.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8ae31622deef..9afb8998e7e5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2474,7 +2474,6 @@ extern void set_dma_reserve(unsigned long new_dma_reserve);
extern void memmap_init_range(unsigned long, int, unsigned long,
unsigned long, unsigned long, enum meminit_context,
struct vmem_altmap *, int migratetype);
-extern void memmap_init_zone(struct zone *zone);
extern void setup_per_zone_wmarks(void);
extern int __meminit init_per_zone_wmark_min(void);
extern void mem_init(void);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ef2265f86b91..5b5c9f5813b9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6400,7 +6400,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
return;
/*
- * The call to memmap_init_zone should have already taken care
+ * The call to memmap_init should have already taken care
* of the pages reserved for the memmap, so we can just jump to
* the end of that region and start processing the device pages.
*/
@@ -6465,7 +6465,7 @@ static void __meminit zone_init_free_lists(struct zone *zone)
/*
* Only struct pages that correspond to ranges defined by memblock.memory
* are zeroed and initialized by going through __init_single_page() during
- * memmap_init_zone().
+ * memmap_init_zone_range().
*
* But, there could be struct pages that correspond to holes in
* memblock.memory. This can happen because of the following reasons:
@@ -6484,9 +6484,9 @@ static void __meminit zone_init_free_lists(struct zone *zone)
* zone/node above the hole except for the trailing pages in the last
* section that will be appended to the zone/node below.
*/
-static u64 __meminit init_unavailable_range(unsigned long spfn,
- unsigned long epfn,
- int zone, int node)
+static void __init init_unavailable_range(unsigned long spfn,
+ unsigned long epfn,
+ int zone, int node)
{
unsigned long pfn;
u64 pgcnt = 0;
@@ -6502,56 +6502,77 @@ static u64 __meminit init_unavailable_range(unsigned long spfn,
pgcnt++;
}
- return pgcnt;
+ if (pgcnt)
+ pr_info("On node %d, zone %s: %lld pages in unavailable ranges",
+ node, zone_names[zone], pgcnt);
}
#else
-static inline u64 init_unavailable_range(unsigned long spfn, unsigned long epfn,
- int zone, int node)
+static inline void init_unavailable_range(unsigned long spfn,
+ unsigned long epfn,
+ int zone, int node)
{
- return 0;
}
#endif
-void __meminit __weak memmap_init_zone(struct zone *zone)
+static void __init memmap_init_zone_range(struct zone *zone,
+ unsigned long start_pfn,
+ unsigned long end_pfn,
+ unsigned long *hole_pfn)
{
unsigned long zone_start_pfn = zone->zone_start_pfn;
unsigned long zone_end_pfn = zone_start_pfn + zone->spanned_pages;
- int i, nid = zone_to_nid(zone), zone_id = zone_idx(zone);
- static unsigned long hole_pfn;
+ int nid = zone_to_nid(zone), zone_id = zone_idx(zone);
+
+ start_pfn = clamp(start_pfn, zone_start_pfn, zone_end_pfn);
+ end_pfn = clamp(end_pfn, zone_start_pfn, zone_end_pfn);
+
+ if (start_pfn >= end_pfn)
+ return;
+
+ memmap_init_range(end_pfn - start_pfn, nid, zone_id, start_pfn,
+ zone_end_pfn, MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
+
+ if (*hole_pfn < start_pfn)
+ init_unavailable_range(*hole_pfn, start_pfn, zone_id, nid);
+
+ *hole_pfn = end_pfn;
+}
+
+static void __init memmap_init(void)
+{
unsigned long start_pfn, end_pfn;
- u64 pgcnt = 0;
+ unsigned long hole_pfn = 0;
+ int i, j, zone_id, nid;
- for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
- start_pfn = clamp(start_pfn, zone_start_pfn, zone_end_pfn);
- end_pfn = clamp(end_pfn, zone_start_pfn, zone_end_pfn);
+ for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
+ struct pglist_data *node = NODE_DATA(nid);
+
+ for (j = 0; j < MAX_NR_ZONES; j++) {
+ struct zone *zone = node->node_zones + j;
- if (end_pfn > start_pfn)
- memmap_init_range(end_pfn - start_pfn, nid,
- zone_id, start_pfn, zone_end_pfn,
- MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
+ if (!populated_zone(zone))
+ continue;
- if (hole_pfn < start_pfn)
- pgcnt += init_unavailable_range(hole_pfn, start_pfn,
- zone_id, nid);
- hole_pfn = end_pfn;
+ memmap_init_zone_range(zone, start_pfn, end_pfn,
+ &hole_pfn);
+ zone_id = j;
+ }
}
#ifdef CONFIG_SPARSEMEM
/*
- * Initialize the hole in the range [zone_end_pfn, section_end].
- * If zone boundary falls in the middle of a section, this hole
- * will be re-initialized during the call to this function for the
- * higher zone.
+ * Initialize the memory map for hole in the range [memory_end,
+ * section_end].
+ * Append the pages in this hole to the highest zone in the last
+ * node.
+ * The call to init_unavailable_range() is outside the ifdef to
+ * silence the compiler warining about zone_id set but not used;
+ * for FLATMEM it is a nop anyway
*/
- end_pfn = round_up(zone_end_pfn, PAGES_PER_SECTION);
+ end_pfn = round_up(end_pfn, PAGES_PER_SECTION);
if (hole_pfn < end_pfn)
- pgcnt += init_unavailable_range(hole_pfn, end_pfn,
- zone_id, nid);
#endif
-
- if (pgcnt)
- pr_info(" %s zone: %llu pages in unavailable ranges\n",
- zone->name, pgcnt);
+ init_unavailable_range(hole_pfn, end_pfn, zone_id, nid);
}
static int zone_batchsize(struct zone *zone)
@@ -7254,7 +7275,6 @@ static void __init free_area_init_core(struct pglist_data *pgdat)
set_pageblock_order();
setup_usemap(zone);
init_currently_empty_zone(zone, zone->zone_start_pfn, size);
- memmap_init_zone(zone);
}
}
@@ -7780,6 +7800,8 @@ void __init free_area_init(unsigned long *max_zone_pfn)
node_set_state(nid, N_MEMORY);
check_for_memory(pgdat, nid);
}
+
+ memmap_init();
}
static int __init cmdline_parse_core(char *p, unsigned long *core,
In summary, this series should be needed for 5.10/5.12/5.13. This is the 5.10.y
backport of the series. Patch 1 is a dependency of patch 2, while patch 2
should be the real fix.
There's a minor conflict on patch 2 when cherry pick due to not having the new
helper called page_needs_cow_for_dma(). It's also mentioned at the entry of
patch 2.
This series should be able to fix a rare race that mentioned in thread:
https://lore.kernel.org/linux-mm/796cbb7-5a1c-1ba0-dde5-479aba8224f2@google…
This fact wasn't discovered when the fix got proposed and merged, because the
fix was originally about uffd-wp and its fork event. However it turns out that
the problematic commit b569a1760782f3d is also causing crashing on fork() of
pmd migration entries which is even more severe than the original uffd-wp
problem.
Stable kernels at least on 5.12.y has the crash reproduced, and it's possible
5.13.y and 5.10.y could hit it due to having the problematic commit
b569a1760782f3d but lacking of the uffd-wp fix patch (8f34f1eac382, which is
also patch 2 of this series).
The pmd entry crash problem was reported by Igor Raits <igor(a)gooddata.com> and
debugged by Hugh Dickins <hughd(a)google.com>.
Please review, thanks.
Peter Xu (2):
mm/thp: simplify copying of huge zero page pmd when fork
mm/userfaultfd: fix uffd-wp special cases for fork()
include/linux/huge_mm.h | 2 +-
include/linux/swapops.h | 2 ++
mm/huge_memory.c | 36 +++++++++++++++++-------------------
mm/memory.c | 25 +++++++++++++------------
4 files changed, 33 insertions(+), 32 deletions(-)
--
2.31.1
In summary: this series should be needed for 5.10/5.12/5.13. This is the
5.13.y/5.12.y backport of the series, and it should be able to be applied on
both of the branches. Patch 1 is a dependency of patch 2, while patch 2 should
be the real fix.
This series should be able to fix a rare race that mentioned in thread:
https://lore.kernel.org/linux-mm/796cbb7-5a1c-1ba0-dde5-479aba8224f2@google…
This fact wasn't discovered when the fix got proposed and merged, because the
fix was originally about uffd-wp and its fork event. However it turns out that
the problematic commit b569a1760782f3d is also causing crashing on fork() of
pmd migration entries which is even more severe than the original uffd-wp
problem.
Stable kernels at least on 5.12.y has the crash reproduced, and it's possible
5.13.y and 5.10.y could hit it due to having the problematic commit
b569a1760782f3d but lacking of the uffd-wp fix patch (8f34f1eac382, which is
also patch 2 of this series).
The pmd entry crash problem was reported by Igor Raits <igor(a)gooddata.com> and
debugged by Hugh Dickins <hughd(a)google.com>.
Please review, thanks.
Peter Xu (2):
mm/thp: simplify copying of huge zero page pmd when fork
mm/userfaultfd: fix uffd-wp special cases for fork()
include/linux/huge_mm.h | 2 +-
include/linux/swapops.h | 2 ++
mm/huge_memory.c | 36 +++++++++++++++++-------------------
mm/memory.c | 25 +++++++++++++------------
4 files changed, 33 insertions(+), 32 deletions(-)
--
2.31.1
Hi,
As of v4.4.276, linux-4.4.y fails to build powerpc:mpc85xx_defconfig and
other powerpc images enabling the fsl_ifc driver.
drivers/memory/fsl_ifc.c: In function 'fsl_ifc_ctrl_probe':
drivers/memory/fsl_ifc.c:308:28: error:
'struct fsl_ifc_ctrl' has no member named 'gregs'; did you mean 'regs'?
Guenter
Hello,
I write again the content of my last E-mail to you, for my smartphone has got
only a small display and therefore a quit tiny keyboard...
Whenever Tor from rosa2016.1 (Rosalabs.ru) and/or Enterprise Linux 6 is
starting, at least three kernel-processes start too and/or become more than
active (CPU capacity lies by more than 10 percent, making my orange LED on the
tower blinking a lot)!
So they have got root-rights!
Their name is like kworker-kcryptd/..., dmcrypt/... and uksmd, but this is not
their complete name.
I do not want this activity as they irritate a lot in possibly causing high
and highest risks!
Tests have shown, that no data seems to get transferred over the net, but
knows the hell, what's really going on here....
Is it worth a patch for you,
can you resp. your foundation patch it?
I would appreciate this.
Regards, Andreas
I also like to tell you, that this Tor is surrounded by firejail from pclos,
this is PCLinuxOS 2021. The processes get permanently highly active for about
five (5) minutes. The LED on the tower for read and write on storage media is
really blinking madly hard all this time, while the process-manager confirms
the whole mysterious process-activities!
After this time, Tor and surfing in the internet with the browser do work fine
as much as the rest of the Linux. So I have to wait several minutes.
Does it have to do with one of the last patches of 5.4, for example the one
for dmcrypt? Anyhow I use full system encryption by LUKS, that might matte
here.
Regards,
Andreas Stöwing
(Gooken)
https://gooken.safe-ws.de/gooken
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: stkwebcam: fix memory leak in stk_camera_probe
Author: Pavel Skripkin <paskripkin(a)gmail.com>
Date: Wed Jul 7 19:54:30 2021 +0200
My local syzbot instance hit memory leak in usb_set_configuration().
The problem was in unputted usb interface. In case of errors after
usb_get_intf() the reference should be putted to correclty free memory
allocated for this interface.
Fixes: ec16dae5453e ("V4L/DVB (7019): V4L: add support for Syntek DC1125 webcams")
Cc: stable(a)vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/usb/stkwebcam/stk-webcam.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
---
diff --git a/drivers/media/usb/stkwebcam/stk-webcam.c b/drivers/media/usb/stkwebcam/stk-webcam.c
index a45d464427c4..0e231e576dc3 100644
--- a/drivers/media/usb/stkwebcam/stk-webcam.c
+++ b/drivers/media/usb/stkwebcam/stk-webcam.c
@@ -1346,7 +1346,7 @@ static int stk_camera_probe(struct usb_interface *interface,
if (!dev->isoc_ep) {
pr_err("Could not find isoc-in endpoint\n");
err = -ENODEV;
- goto error;
+ goto error_put;
}
dev->vsettings.palette = V4L2_PIX_FMT_RGB565;
dev->vsettings.mode = MODE_VGA;
@@ -1359,10 +1359,12 @@ static int stk_camera_probe(struct usb_interface *interface,
err = stk_register_video_device(dev);
if (err)
- goto error;
+ goto error_put;
return 0;
+error_put:
+ usb_put_intf(interface);
error:
v4l2_ctrl_handler_free(hdl);
v4l2_device_unregister(&dev->v4l2_dev);
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: stkwebcam: fix memory leak in stk_camera_probe
Author: Pavel Skripkin <paskripkin(a)gmail.com>
Date: Wed Jul 7 19:54:30 2021 +0200
My local syzbot instance hit memory leak in usb_set_configuration().
The problem was in unputted usb interface. In case of errors after
usb_get_intf() the reference should be putted to correclty free memory
allocated for this interface.
Fixes: ec16dae5453e ("V4L/DVB (7019): V4L: add support for Syntek DC1125 webcams")
Cc: stable(a)vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/usb/stkwebcam/stk-webcam.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
---
diff --git a/drivers/media/usb/stkwebcam/stk-webcam.c b/drivers/media/usb/stkwebcam/stk-webcam.c
index a45d464427c4..0e231e576dc3 100644
--- a/drivers/media/usb/stkwebcam/stk-webcam.c
+++ b/drivers/media/usb/stkwebcam/stk-webcam.c
@@ -1346,7 +1346,7 @@ static int stk_camera_probe(struct usb_interface *interface,
if (!dev->isoc_ep) {
pr_err("Could not find isoc-in endpoint\n");
err = -ENODEV;
- goto error;
+ goto error_put;
}
dev->vsettings.palette = V4L2_PIX_FMT_RGB565;
dev->vsettings.mode = MODE_VGA;
@@ -1359,10 +1359,12 @@ static int stk_camera_probe(struct usb_interface *interface,
err = stk_register_video_device(dev);
if (err)
- goto error;
+ goto error_put;
return 0;
+error_put:
+ usb_put_intf(interface);
error:
v4l2_ctrl_handler_free(hdl);
v4l2_device_unregister(&dev->v4l2_dev);
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: stkwebcam: fix memory leak in stk_camera_probe
Author: Pavel Skripkin <paskripkin(a)gmail.com>
Date: Wed Jul 7 19:54:30 2021 +0200
My local syzbot instance hit memory leak in usb_set_configuration().
The problem was in unputted usb interface. In case of errors after
usb_get_intf() the reference should be putted to correclty free memory
allocated for this interface.
Fixes: ec16dae5453e ("V4L/DVB (7019): V4L: add support for Syntek DC1125 webcams")
Cc: stable(a)vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/usb/stkwebcam/stk-webcam.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
---
diff --git a/drivers/media/usb/stkwebcam/stk-webcam.c b/drivers/media/usb/stkwebcam/stk-webcam.c
index a45d464427c4..0e231e576dc3 100644
--- a/drivers/media/usb/stkwebcam/stk-webcam.c
+++ b/drivers/media/usb/stkwebcam/stk-webcam.c
@@ -1346,7 +1346,7 @@ static int stk_camera_probe(struct usb_interface *interface,
if (!dev->isoc_ep) {
pr_err("Could not find isoc-in endpoint\n");
err = -ENODEV;
- goto error;
+ goto error_put;
}
dev->vsettings.palette = V4L2_PIX_FMT_RGB565;
dev->vsettings.mode = MODE_VGA;
@@ -1359,10 +1359,12 @@ static int stk_camera_probe(struct usb_interface *interface,
err = stk_register_video_device(dev);
if (err)
- goto error;
+ goto error_put;
return 0;
+error_put:
+ usb_put_intf(interface);
error:
v4l2_ctrl_handler_free(hdl);
v4l2_device_unregister(&dev->v4l2_dev);
"Hello,
We request you to add a guest post pertinent to the betting site with
dofollow links on your website planet.kernel.org. We hope to deserve
the offer,if so how much do you charge for each guest post (Or) Blog
posts?
Thanks.
Best Regards "
Commit dd2d082b5760 ("riscv: Cleanup setup_bootmem()") adjusted
the calling sequence in setup_bootmem(), which invalidates the fix
commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
did for 32-bit RISC-V unfortunately.
So now 32-bit RISC-V does not boot again when testing booting kernel
on QEMU 'virt' with '-m 2G', which was exactly what the original
commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
tried to fix.
Fixes: dd2d082b5760 ("riscv: Cleanup setup_bootmem()")
Cc: stable(a)vger.kernel.org # v5.12+
Signed-off-by: Bin Meng <bmeng.cn(a)gmail.com>
---
arch/riscv/mm/init.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 4c4c92ce0bb8..9b23b95c50cf 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -123,7 +123,7 @@ void __init setup_bootmem(void)
{
phys_addr_t vmlinux_end = __pa_symbol(&_end);
phys_addr_t vmlinux_start = __pa_symbol(&_start);
- phys_addr_t dram_end = memblock_end_of_DRAM();
+ phys_addr_t dram_end;
phys_addr_t max_mapped_addr = __pa(~(ulong)0);
#ifdef CONFIG_XIP_KERNEL
@@ -146,6 +146,8 @@ void __init setup_bootmem(void)
#endif
memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
+ dram_end = memblock_end_of_DRAM();
+
/*
* memblock allocator is not aware of the fact that last 4K bytes of
* the addressable memory can not be mapped because of IS_ERR_VALUE
--
2.25.1
This is a note to let you know that I've just added the patch titled
tty: Fix out-of-bound vmalloc access in imageblit
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 3b0c406124719b625b1aba431659f5cdc24a982c Mon Sep 17 00:00:00 2001
From: Igor Matheus Andrade Torrente <igormtorrente(a)gmail.com>
Date: Mon, 28 Jun 2021 10:45:09 -0300
Subject: tty: Fix out-of-bound vmalloc access in imageblit
This issue happens when a userspace program does an ioctl
FBIOPUT_VSCREENINFO passing the fb_var_screeninfo struct
containing only the fields xres, yres, and bits_per_pixel
with values.
If this struct is the same as the previous ioctl, the
vc_resize() detects it and doesn't call the resize_screen(),
leaving the fb_var_screeninfo incomplete. And this leads to
the updatescrollmode() calculates a wrong value to
fbcon_display->vrows, which makes the real_y() return a
wrong value of y, and that value, eventually, causes
the imageblit to access an out-of-bound address value.
To solve this issue I made the resize_screen() be called
even if the screen does not need any resizing, so it will
"fix and fill" the fb_var_screeninfo independently.
Cc: stable <stable(a)vger.kernel.org> # after 5.15-rc2 is out, give it time to bake
Reported-and-tested-by: syzbot+858dc7a2f7ef07c2c219(a)syzkaller.appspotmail.com
Signed-off-by: Igor Matheus Andrade Torrente <igormtorrente(a)gmail.com>
Link: https://lore.kernel.org/r/20210628134509.15895-1-igormtorrente@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index ef981d3b7bb4..484eda1958f5 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1219,8 +1219,25 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
new_row_size = new_cols << 1;
new_screen_size = new_row_size * new_rows;
- if (new_cols == vc->vc_cols && new_rows == vc->vc_rows)
- return 0;
+ if (new_cols == vc->vc_cols && new_rows == vc->vc_rows) {
+ /*
+ * This function is being called here to cover the case
+ * where the userspace calls the FBIOPUT_VSCREENINFO twice,
+ * passing the same fb_var_screeninfo containing the fields
+ * yres/xres equal to a number non-multiple of vc_font.height
+ * and yres_virtual/xres_virtual equal to number lesser than the
+ * vc_font.height and yres/xres.
+ * In the second call, the struct fb_var_screeninfo isn't
+ * being modified by the underlying driver because of the
+ * if above, and this causes the fbcon_display->vrows to become
+ * negative and it eventually leads to out-of-bound
+ * access by the imageblit function.
+ * To give the correct values to the struct and to not have
+ * to deal with possible errors from the code below, we call
+ * the resize_screen here as well.
+ */
+ return resize_screen(vc, new_cols, new_rows, user);
+ }
if (new_screen_size > KMALLOC_MAX_SIZE || !new_screen_size)
return -EINVAL;
--
2.32.0
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: rc-loopback: return number of emitters rather than error
Author: Sean Young <sean(a)mess.org>
Date: Sat Jul 3 15:37:17 2021 +0200
The LIRC_SET_TRANSMITTER_MASK ioctl should return the number of emitters
if an invalid list was set.
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Young <sean(a)mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/rc/rc-loopback.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/drivers/media/rc/rc-loopback.c b/drivers/media/rc/rc-loopback.c
index 1ba3f96ffa7d..40ab66c850f2 100644
--- a/drivers/media/rc/rc-loopback.c
+++ b/drivers/media/rc/rc-loopback.c
@@ -42,7 +42,7 @@ static int loop_set_tx_mask(struct rc_dev *dev, u32 mask)
if ((mask & (RXMASK_REGULAR | RXMASK_LEARNING)) != mask) {
dprintk("invalid tx mask: %u\n", mask);
- return -EINVAL;
+ return 2;
}
dprintk("setting tx mask: %u\n", mask);
The patch titled
Subject: mm: fix the deadlock in finish_fault()
has been added to the -mm tree. Its filename is
mm-fix-the-deadlock-in-finish_fault.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-fix-the-deadlock-in-finish_fau…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-the-deadlock-in-finish_fau…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Qi Zheng <zhengqi.arch(a)bytedance.com>
Subject: mm: fix the deadlock in finish_fault()
Commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock with writeback") fix
the following ABBA deadlock by pre-allocating the pte page table without
holding the page lock.
lock_page(A)
SetPageWriteback(A)
unlock_page(A)
lock_page(B)
lock_page(B)
pte_alloc_one
shrink_page_list
wait_on_page_writeback(A)
SetPageWriteback(B)
unlock_page(B)
# flush A, B to clear the writeback
Commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") rework the relevant code but ignore this race. This will
cause the deadlock above to appear again, so fix it.
Link: https://lkml.kernel.org/r/20210721074849.57004-1-zhengqi.arch@bytedance.com
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
--- a/mm/memory.c~mm-fix-the-deadlock-in-finish_fault
+++ a/mm/memory.c
@@ -4026,8 +4026,17 @@ vm_fault_t finish_fault(struct vm_fault
return ret;
}
- if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
+ if (vmf->prealloc_pte) {
+ vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
+ if (likely(pmd_none(*vmf->pmd))) {
+ mm_inc_nr_ptes(vma->vm_mm);
+ pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte);
+ vmf->prealloc_pte = NULL;
+ }
+ spin_unlock(vmf->ptl);
+ } else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) {
return VM_FAULT_OOM;
+ }
}
/* See comment in handle_pte_fault() */
_
Patches currently in -mm which might be from zhengqi.arch(a)bytedance.com are
mm-fix-the-deadlock-in-finish_fault.patch
This reverts commit 3d5bfbd9716318b1ca5c38488aa69f64d38a9aa5.
When booting with threadirqs, it causes a splat
WARNING: CPU: 0 PID: 29 at kernel/irq/handle.c:159 __handle_irq_event_percpu+0x1ec/0x27c
irq 66 handler irq_default_primary_handler+0x0/0x1c enabled interrupts
That splat later went away with commit 81e2073c175b ("genirq: Disable
interrupts for force threaded handlers"), which got backported to
-stable. However, when running an -rt kernel, the splat still
exists. Moreover, quoting Thomas Gleixner [1]
But 3d5bfbd97163 ("gpio: mpc8xxx: change the gpio interrupt flags.")
has nothing to do with that:
"Delete the interrupt IRQF_NO_THREAD flags in order to gpio interrupts
can be threaded to allow high-priority processes to preempt."
This changelog is blatantly wrong. In mainline forced irq threads
have always been invoked with softirqs disabled, which obviously
makes them non-preemptible.
So the patch didn't even do what its commit log said.
[1] https://lore.kernel.org/lkml/871r8zey88.ffs@nanos.tec.linutronix.de/
Cc: stable(a)vger.kernel.org # v5.9+
Signed-off-by: Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
---
Thomas, please correct me if I misinterpreted your explanation.
drivers/gpio/gpio-mpc8xxx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-mpc8xxx.c b/drivers/gpio/gpio-mpc8xxx.c
index 4b9157a69fca..50b321a1ab1b 100644
--- a/drivers/gpio/gpio-mpc8xxx.c
+++ b/drivers/gpio/gpio-mpc8xxx.c
@@ -405,7 +405,7 @@ static int mpc8xxx_probe(struct platform_device *pdev)
ret = devm_request_irq(&pdev->dev, mpc8xxx_gc->irqn,
mpc8xxx_gpio_irq_cascade,
- IRQF_SHARED, "gpio-cascade",
+ IRQF_NO_THREAD | IRQF_SHARED, "gpio-cascade",
mpc8xxx_gc);
if (ret) {
dev_err(&pdev->dev,
--
2.31.1
In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing
of the mount mode string was changed from match_octal() to fsparam_u32.
This changed existing behavior as match_octal does not require octal
values to have a '0' prefix, but fsparam_u32 does.
Use fsparam_u32oct which provides the same behavior as match_octal.
Reported-by: Dennis Camera <bugs+kernel.org(a)dtnr.ch>
Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
CC: <stable(a)vger.kernel.org>
---
fs/hugetlbfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 926eeb9bf4eb..cdfb1ae78a3f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -77,7 +77,7 @@ enum hugetlb_param {
static const struct fs_parameter_spec hugetlb_fs_parameters[] = {
fsparam_u32 ("gid", Opt_gid),
fsparam_string("min_size", Opt_min_size),
- fsparam_u32 ("mode", Opt_mode),
+ fsparam_u32oct("mode", Opt_mode),
fsparam_string("nr_inodes", Opt_nr_inodes),
fsparam_string("pagesize", Opt_pagesize),
fsparam_string("size", Opt_size),
--
2.31.1
The revert of commit 26fd962 missed out on reverting an incorrect change
to a return value. The niu_pci_vpd_scan_props(..) == 1 case appears to
be a normal path - treating it as an error and return -EINVAL was
breaking VPD_SCAN and causing the driver to fail to load.
Fix it, so my Neptune card works again.
Cc: Kangjie Lu <kjlu(a)umn.edu>
Cc: Shannon Nelson <shannon.lee.nelson(a)gmail.com>
Cc: David S. Miller <davem(a)davemloft.net>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Fixes: 7930742d ('Revert "niu: fix missing checks of niu_pci_eeprom_read"')
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Paul Jakma <paul(a)jakma.org>
---
--- e6e337708c22f80824b82d4af645f20715730ad0/drivers/net/ethernet/sun/niu.c 2021-07-20 20:51:52.054770659 +0100
+++ fix/drivers/net/ethernet/sun/niu.c 2021-07-20 20:49:02.194870695 +0100
@@ -8192,7 +8192,7 @@
if (err < 0)
return err;
if (err == 1)
- return -EINVAL;
+ return 0;
}
return 0;
}
--
Paul Jakma | paul(a)jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
How sharper than a serpent's tooth is a sister's "See?"
-- Linus Van Pelt
This is a note to let you know that I've just added the patch titled
driver core: auxiliary bus: Fix memory leak when driver_register()
to my driver-core git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git
in the driver-core-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 4afa0c22eed33cfe0c590742387f0d16f32412f3 Mon Sep 17 00:00:00 2001
From: Peter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
Date: Tue, 13 Jul 2021 12:34:38 +0300
Subject: driver core: auxiliary bus: Fix memory leak when driver_register()
fail
If driver_register() returns with error we need to free the memory
allocated for auxdrv->driver.name before returning from
__auxiliary_driver_register()
Fixes: 7de3697e9cbd4 ("Add auxiliary bus support")
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi(a)linux.intel.com>
Link: https://lore.kernel.org/r/20210713093438.3173-1-peter.ujfalusi@linux.intel.…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/base/auxiliary.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/base/auxiliary.c b/drivers/base/auxiliary.c
index adc199dfba3c..6a30264ab2ba 100644
--- a/drivers/base/auxiliary.c
+++ b/drivers/base/auxiliary.c
@@ -231,6 +231,8 @@ EXPORT_SYMBOL_GPL(auxiliary_find_device);
int __auxiliary_driver_register(struct auxiliary_driver *auxdrv,
struct module *owner, const char *modname)
{
+ int ret;
+
if (WARN_ON(!auxdrv->probe) || WARN_ON(!auxdrv->id_table))
return -EINVAL;
@@ -246,7 +248,11 @@ int __auxiliary_driver_register(struct auxiliary_driver *auxdrv,
auxdrv->driver.bus = &auxiliary_bus_type;
auxdrv->driver.mod_name = modname;
- return driver_register(&auxdrv->driver);
+ ret = driver_register(&auxdrv->driver);
+ if (ret)
+ kfree(auxdrv->driver.name);
+
+ return ret;
}
EXPORT_SYMBOL_GPL(__auxiliary_driver_register);
--
2.32.0
From: Joerg Roedel <jroedel(a)suse.de>
For now, kexec is not supported when running as an SEV-ES guest. Doing
so requires additional hypervisor support and special code to hand
over the CPUs to the new kernel in a safe way.
Until this is implemented, do not support kexec in SEV-ES guests.
Cc: stable(a)vger.kernel.org # v5.10+
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
---
arch/x86/kernel/machine_kexec_64.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 131f30fdcfbd..a8e16a411b40 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -591,3 +591,11 @@ void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
*/
set_memory_encrypted((unsigned long)vaddr, pages);
}
+
+/*
+ * Kexec is not supported in SEV-ES guests yet
+ */
+bool arch_kexec_supported(void)
+{
+ return !sev_es_active();
+}
--
2.31.1
This is a note to let you know that I've just added the patch titled
nds32: fix up stack guard gap
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From c453db6cd96418c79702eaf38259002755ab23ff Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Tue, 29 Jun 2021 12:40:24 +0200
Subject: nds32: fix up stack guard gap
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") fixed
up all architectures to deal with the stack guard gap. But when nds32
was added to the tree, it forgot to do the same thing.
Resolve this by properly fixing up the nsd32's version of
arch_get_unmapped_area()
Cc: Nick Hu <nickhu(a)andestech.com>
Cc: Greentime Hu <green.hu(a)gmail.com>
Cc: Vincent Chen <deanbo422(a)gmail.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Qiang Liu <cyruscyliu(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Reported-by: iLifetruth <yixiaonn(a)gmail.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Link: https://lore.kernel.org/r/20210629104024.2293615-1-gregkh@linuxfoundation.o…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/nds32/mm/mmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/nds32/mm/mmap.c b/arch/nds32/mm/mmap.c
index c206b31ce07a..1bdf5e7d1b43 100644
--- a/arch/nds32/mm/mmap.c
+++ b/arch/nds32/mm/mmap.c
@@ -59,7 +59,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
vma = find_vma(mm, addr);
if (TASK_SIZE - len >= addr &&
- (!vma || addr + len <= vma->vm_start))
+ (!vma || addr + len <= vm_start_gap(vma)))
return addr;
}
--
2.32.0
This is a note to let you know that I've just added the patch titled
serial: 8250_pci: Enumerate Elkhart Lake UARTs via dedicated driver
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 7f0909db761535aefafa77031062603a71557267 Mon Sep 17 00:00:00 2001
From: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Date: Tue, 13 Jul 2021 13:17:39 +0300
Subject: serial: 8250_pci: Enumerate Elkhart Lake UARTs via dedicated driver
Elkhart Lake UARTs are PCI enumerated Synopsys DesignWare v4.0+ UART
integrated with Intel iDMA 32-bit DMA controller. There is a specific
driver to handle them, i.e. 8250_lpss. Hence, disable 8250_pci
enumeration for these UARTs.
Fixes: 1b91d97c66ef ("serial: 8250_lpss: Add ->setup() for Elkhart Lake ports")
Fixes: 4f912b898dc2 ("serial: 8250_lpss: Enable HS UART on Elkhart Lake")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Link: https://lore.kernel.org/r/20210713101739.36962-1-andriy.shevchenko@linux.in…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/8250/8250_pci.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
index 75827b608fdb..02985cf90ef2 100644
--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -3836,6 +3836,12 @@ static const struct pci_device_id blacklist[] = {
{ PCI_VDEVICE(INTEL, 0x0f0c), },
{ PCI_VDEVICE(INTEL, 0x228a), },
{ PCI_VDEVICE(INTEL, 0x228c), },
+ { PCI_VDEVICE(INTEL, 0x4b96), },
+ { PCI_VDEVICE(INTEL, 0x4b97), },
+ { PCI_VDEVICE(INTEL, 0x4b98), },
+ { PCI_VDEVICE(INTEL, 0x4b99), },
+ { PCI_VDEVICE(INTEL, 0x4b9a), },
+ { PCI_VDEVICE(INTEL, 0x4b9b), },
{ PCI_VDEVICE(INTEL, 0x9ce3), },
{ PCI_VDEVICE(INTEL, 0x9ce4), },
--
2.32.0
This is a note to let you know that I've just added the patch titled
serial: tegra: Only print FIFO error message when an error occurs
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From cc9ca4d95846cbbece48d9cd385550f8fba6a3c1 Mon Sep 17 00:00:00 2001
From: Jon Hunter <jonathanh(a)nvidia.com>
Date: Wed, 30 Jun 2021 13:56:43 +0100
Subject: serial: tegra: Only print FIFO error message when an error occurs
The Tegra serial driver always prints an error message when enabling the
FIFO for devices that have support for checking the FIFO enable status.
Fix this by displaying the error message, only when an error occurs.
Finally, update the error message to make it clear that enabling the
FIFO failed and display the error code.
Fixes: 222dcdff3405 ("serial: tegra: check for FIFO mode enabled status")
Cc: <stable(a)vger.kernel.org>
Acked-by: Thierry Reding <treding(a)nvidia.com>
Signed-off-by: Jon Hunter <jonathanh(a)nvidia.com>
Link: https://lore.kernel.org/r/20210630125643.264264-1-jonathanh@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/serial-tegra.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c
index 222032792d6c..eba5b9ecba34 100644
--- a/drivers/tty/serial/serial-tegra.c
+++ b/drivers/tty/serial/serial-tegra.c
@@ -1045,9 +1045,11 @@ static int tegra_uart_hw_init(struct tegra_uart_port *tup)
if (tup->cdata->fifo_mode_enable_status) {
ret = tegra_uart_wait_fifo_mode_enabled(tup);
- dev_err(tup->uport.dev, "FIFO mode not enabled\n");
- if (ret < 0)
+ if (ret < 0) {
+ dev_err(tup->uport.dev,
+ "Failed to enable FIFO mode: %d\n", ret);
return ret;
+ }
} else {
/*
* For all tegra devices (up to t210), there is a hardware
--
2.32.0
This is a note to let you know that I've just added the patch titled
tty: Fix out-of-bound vmalloc access in imageblit
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 3b0c406124719b625b1aba431659f5cdc24a982c Mon Sep 17 00:00:00 2001
From: Igor Matheus Andrade Torrente <igormtorrente(a)gmail.com>
Date: Mon, 28 Jun 2021 10:45:09 -0300
Subject: tty: Fix out-of-bound vmalloc access in imageblit
This issue happens when a userspace program does an ioctl
FBIOPUT_VSCREENINFO passing the fb_var_screeninfo struct
containing only the fields xres, yres, and bits_per_pixel
with values.
If this struct is the same as the previous ioctl, the
vc_resize() detects it and doesn't call the resize_screen(),
leaving the fb_var_screeninfo incomplete. And this leads to
the updatescrollmode() calculates a wrong value to
fbcon_display->vrows, which makes the real_y() return a
wrong value of y, and that value, eventually, causes
the imageblit to access an out-of-bound address value.
To solve this issue I made the resize_screen() be called
even if the screen does not need any resizing, so it will
"fix and fill" the fb_var_screeninfo independently.
Cc: stable <stable(a)vger.kernel.org> # after 5.15-rc2 is out, give it time to bake
Reported-and-tested-by: syzbot+858dc7a2f7ef07c2c219(a)syzkaller.appspotmail.com
Signed-off-by: Igor Matheus Andrade Torrente <igormtorrente(a)gmail.com>
Link: https://lore.kernel.org/r/20210628134509.15895-1-igormtorrente@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index ef981d3b7bb4..484eda1958f5 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1219,8 +1219,25 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
new_row_size = new_cols << 1;
new_screen_size = new_row_size * new_rows;
- if (new_cols == vc->vc_cols && new_rows == vc->vc_rows)
- return 0;
+ if (new_cols == vc->vc_cols && new_rows == vc->vc_rows) {
+ /*
+ * This function is being called here to cover the case
+ * where the userspace calls the FBIOPUT_VSCREENINFO twice,
+ * passing the same fb_var_screeninfo containing the fields
+ * yres/xres equal to a number non-multiple of vc_font.height
+ * and yres_virtual/xres_virtual equal to number lesser than the
+ * vc_font.height and yres/xres.
+ * In the second call, the struct fb_var_screeninfo isn't
+ * being modified by the underlying driver because of the
+ * if above, and this causes the fbcon_display->vrows to become
+ * negative and it eventually leads to out-of-bound
+ * access by the imageblit function.
+ * To give the correct values to the struct and to not have
+ * to deal with possible errors from the code below, we call
+ * the resize_screen here as well.
+ */
+ return resize_screen(vc, new_cols, new_rows, user);
+ }
if (new_screen_size > KMALLOC_MAX_SIZE || !new_screen_size)
return -EINVAL;
--
2.32.0
This is a note to let you know that I've just added the patch titled
staging: rtl8723bs: Fix a resource leak in sd_int_dpc
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 990e4ad3ddcb72216caeddd6e62c5f45a21e8121 Mon Sep 17 00:00:00 2001
From: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Date: Mon, 28 Jun 2021 23:22:39 +0800
Subject: staging: rtl8723bs: Fix a resource leak in sd_int_dpc
The "c2h_evt" variable is not freed when function call
"c2h_evt_read_88xx" failed
Fixes: 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver")
Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
Signed-off-by: Xiangyang Zhang <xyz.sun.ok(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20210628152239.5475-1-xyz.sun.ok@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/rtl8723bs/hal/sdio_ops.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/staging/rtl8723bs/hal/sdio_ops.c b/drivers/staging/rtl8723bs/hal/sdio_ops.c
index 2dd251ce177e..a545832a468e 100644
--- a/drivers/staging/rtl8723bs/hal/sdio_ops.c
+++ b/drivers/staging/rtl8723bs/hal/sdio_ops.c
@@ -909,6 +909,8 @@ void sd_int_dpc(struct adapter *adapter)
} else {
rtw_c2h_wk_cmd(adapter, (u8 *)c2h_evt);
}
+ } else {
+ kfree(c2h_evt);
}
} else {
/* Error handling for malloc fail */
--
2.32.0
This is a note to let you know that I've just added the patch titled
usb: dwc2: gadget: Fix sending zero length packet in DDMA mode.
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From d53dc38857f6dbefabd9eecfcbf67b6eac9a1ef4 Mon Sep 17 00:00:00 2001
From: Minas Harutyunyan <Minas.Harutyunyan(a)synopsys.com>
Date: Tue, 20 Jul 2021 05:41:24 -0700
Subject: usb: dwc2: gadget: Fix sending zero length packet in DDMA mode.
Sending zero length packet in DDMA mode perform by DMA descriptor
by setting SP (short packet) flag.
For DDMA in function dwc2_hsotg_complete_in() does not need to send
zlp.
Tested by USBCV MSC tests.
Fixes: f71b5e2533de ("usb: dwc2: gadget: fix zero length packet transfers")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Minas Harutyunyan <Minas.Harutyunyan(a)synopsys.com>
Link: https://lore.kernel.org/r/967bad78c55dd2db1c19714eee3d0a17cf99d74a.16267777…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc2/gadget.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index 74d25019272f..3146df6e6510 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -2749,12 +2749,14 @@ static void dwc2_hsotg_complete_in(struct dwc2_hsotg *hsotg,
return;
}
- /* Zlp for all endpoints, for ep0 only in DATA IN stage */
+ /* Zlp for all endpoints in non DDMA, for ep0 only in DATA IN stage */
if (hs_ep->send_zlp) {
- dwc2_hsotg_program_zlp(hsotg, hs_ep);
hs_ep->send_zlp = 0;
- /* transfer will be completed on next complete interrupt */
- return;
+ if (!using_desc_dma(hsotg)) {
+ dwc2_hsotg_program_zlp(hsotg, hs_ep);
+ /* transfer will be completed on next complete interrupt */
+ return;
+ }
}
if (hs_ep->index == 0 && hsotg->ep0_state == DWC2_EP0_DATA_IN) {
--
2.32.0
This is a note to let you know that I've just added the patch titled
usb: dwc2: Skip clock gating on Samsung SoCs
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From c4a0f7a6ab5417eb6105b0e1d7e6e67f6ef7d4e5 Mon Sep 17 00:00:00 2001
From: Marek Szyprowski <m.szyprowski(a)samsung.com>
Date: Fri, 16 Jul 2021 07:01:27 +0200
Subject: usb: dwc2: Skip clock gating on Samsung SoCs
Commit 0112b7ce68ea ("usb: dwc2: Update dwc2_handle_usb_suspend_intr
function.") changed the way the driver handles power down modes in a such
way that it uses clock gating when no other power down mode is available.
This however doesn't work well on the DWC2 implementation used on the
Samsung SoCs. When a clock gating is enabled, system hangs. It looks that
the proper clock gating requires some additional glue code in the shared
USB2 PHY and/or Samsung glue code for the DWC2. To restore driver
operation on the Samsung SoCs simply skip enabling clock gating mode
until one finds what is really needed to make it working reliably.
Fixes: 0112b7ce68ea ("usb: dwc2: Update dwc2_handle_usb_suspend_intr function.")
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)canonical.com>
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
Link: https://lore.kernel.org/r/20210716050127.4406-1-m.szyprowski@samsung.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc2/core.h | 4 ++++
drivers/usb/dwc2/core_intr.c | 3 ++-
drivers/usb/dwc2/hcd.c | 6 ++++--
drivers/usb/dwc2/params.c | 1 +
4 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index ab6b815e0089..483de2bbfaab 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -383,6 +383,9 @@ enum dwc2_ep0_state {
* 0 - No (default)
* 1 - Partial power down
* 2 - Hibernation
+ * @no_clock_gating: Specifies whether to avoid clock gating feature.
+ * 0 - No (use clock gating)
+ * 1 - Yes (avoid it)
* @lpm: Enable LPM support.
* 0 - No
* 1 - Yes
@@ -480,6 +483,7 @@ struct dwc2_core_params {
#define DWC2_POWER_DOWN_PARAM_NONE 0
#define DWC2_POWER_DOWN_PARAM_PARTIAL 1
#define DWC2_POWER_DOWN_PARAM_HIBERNATION 2
+ bool no_clock_gating;
bool lpm;
bool lpm_clock_gating;
diff --git a/drivers/usb/dwc2/core_intr.c b/drivers/usb/dwc2/core_intr.c
index a5ab03808da6..a5c52b237e72 100644
--- a/drivers/usb/dwc2/core_intr.c
+++ b/drivers/usb/dwc2/core_intr.c
@@ -556,7 +556,8 @@ static void dwc2_handle_usb_suspend_intr(struct dwc2_hsotg *hsotg)
* If neither hibernation nor partial power down are supported,
* clock gating is used to save power.
*/
- dwc2_gadget_enter_clock_gating(hsotg);
+ if (!hsotg->params.no_clock_gating)
+ dwc2_gadget_enter_clock_gating(hsotg);
}
/*
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 035d4911a3c3..2a7828971d05 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -3338,7 +3338,8 @@ int dwc2_port_suspend(struct dwc2_hsotg *hsotg, u16 windex)
* If not hibernation nor partial power down are supported,
* clock gating is used to save power.
*/
- dwc2_host_enter_clock_gating(hsotg);
+ if (!hsotg->params.no_clock_gating)
+ dwc2_host_enter_clock_gating(hsotg);
break;
}
@@ -4402,7 +4403,8 @@ static int _dwc2_hcd_suspend(struct usb_hcd *hcd)
* If not hibernation nor partial power down are supported,
* clock gating is used to save power.
*/
- dwc2_host_enter_clock_gating(hsotg);
+ if (!hsotg->params.no_clock_gating)
+ dwc2_host_enter_clock_gating(hsotg);
/* After entering suspend, hardware is not accessible */
clear_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
diff --git a/drivers/usb/dwc2/params.c b/drivers/usb/dwc2/params.c
index 67c5eb140232..59e119345994 100644
--- a/drivers/usb/dwc2/params.c
+++ b/drivers/usb/dwc2/params.c
@@ -76,6 +76,7 @@ static void dwc2_set_s3c6400_params(struct dwc2_hsotg *hsotg)
struct dwc2_core_params *p = &hsotg->params;
p->power_down = DWC2_POWER_DOWN_PARAM_NONE;
+ p->no_clock_gating = true;
p->phy_utmi_width = 8;
}
--
2.32.0
This is a note to let you know that I've just added the patch titled
usb: renesas_usbhs: Fix superfluous irqs happen after usb_pkt_pop()
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 5719df243e118fb343725e8b2afb1637e1af1373 Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Thu, 24 Jun 2021 21:20:39 +0900
Subject: usb: renesas_usbhs: Fix superfluous irqs happen after usb_pkt_pop()
This driver has a potential issue which this driver is possible to
cause superfluous irqs after usb_pkt_pop() is called. So, after
the commit 3af32605289e ("usb: renesas_usbhs: fix error return
code of usbhsf_pkt_handler()") had been applied, we could observe
the following error happened when we used g_audio.
renesas_usbhs e6590000.usb: irq_ready run_error 1 : -22
To fix the issue, disable the tx or rx interrupt in usb_pkt_pop().
Fixes: 2743e7f90dc0 ("usb: renesas_usbhs: fix the usb_pkt_pop()")
Cc: <stable(a)vger.kernel.org> # v4.4+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Link: https://lore.kernel.org/r/20210624122039.596528-1-yoshihiro.shimoda.uh@rene…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/renesas_usbhs/fifo.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/usb/renesas_usbhs/fifo.c b/drivers/usb/renesas_usbhs/fifo.c
index b5e7991dc7d9..a3c2b01ccf7b 100644
--- a/drivers/usb/renesas_usbhs/fifo.c
+++ b/drivers/usb/renesas_usbhs/fifo.c
@@ -101,6 +101,8 @@ static struct dma_chan *usbhsf_dma_chan_get(struct usbhs_fifo *fifo,
#define usbhsf_dma_map(p) __usbhsf_dma_map_ctrl(p, 1)
#define usbhsf_dma_unmap(p) __usbhsf_dma_map_ctrl(p, 0)
static int __usbhsf_dma_map_ctrl(struct usbhs_pkt *pkt, int map);
+static void usbhsf_tx_irq_ctrl(struct usbhs_pipe *pipe, int enable);
+static void usbhsf_rx_irq_ctrl(struct usbhs_pipe *pipe, int enable);
struct usbhs_pkt *usbhs_pkt_pop(struct usbhs_pipe *pipe, struct usbhs_pkt *pkt)
{
struct usbhs_priv *priv = usbhs_pipe_to_priv(pipe);
@@ -123,6 +125,11 @@ struct usbhs_pkt *usbhs_pkt_pop(struct usbhs_pipe *pipe, struct usbhs_pkt *pkt)
if (chan) {
dmaengine_terminate_all(chan);
usbhsf_dma_unmap(pkt);
+ } else {
+ if (usbhs_pipe_is_dir_in(pipe))
+ usbhsf_rx_irq_ctrl(pipe, 0);
+ else
+ usbhsf_tx_irq_ctrl(pipe, 0);
}
usbhs_pipe_clear_without_sequence(pipe, 0, 0);
--
2.32.0
This is a note to let you know that I've just added the patch titled
usb: dwc2: gadget: Fix GOUTNAK flow for Slave mode.
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From fecb3a171db425e5068b27231f8efe154bf72637 Mon Sep 17 00:00:00 2001
From: Minas Harutyunyan <Minas.Harutyunyan(a)synopsys.com>
Date: Tue, 13 Jul 2021 09:32:55 +0400
Subject: usb: dwc2: gadget: Fix GOUTNAK flow for Slave mode.
Because of dwc2_hsotg_ep_stop_xfr() function uses poll
mode, first need to mask GINTSTS_GOUTNAKEFF interrupt.
In Slave mode GINTSTS_GOUTNAKEFF interrupt will be
aserted only after pop OUT NAK status packet from RxFIFO.
In dwc2_hsotg_ep_sethalt() function before setting
DCTL_SGOUTNAK need to unmask GOUTNAKEFF interrupt.
Tested by USBCV CH9 and MSC tests set in Slave, BDMA and DDMA.
All tests are passed.
Fixes: a4f827714539a ("usb: dwc2: gadget: Disable enabled HW endpoint in dwc2_hsotg_ep_disable")
Fixes: 6070636c4918c ("usb: dwc2: Fix Stalling a Non-Isochronous OUT EP")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Minas Harutyunyan <Minas.Harutyunyan(a)synopsys.com>
Link: https://lore.kernel.org/r/e17fad802bbcaf879e1ed6745030993abb93baf8.16261529…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/dwc2/gadget.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index c581ee41ac81..74d25019272f 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -3900,9 +3900,27 @@ static void dwc2_hsotg_ep_stop_xfr(struct dwc2_hsotg *hsotg,
__func__);
}
} else {
+ /* Mask GINTSTS_GOUTNAKEFF interrupt */
+ dwc2_hsotg_disable_gsint(hsotg, GINTSTS_GOUTNAKEFF);
+
if (!(dwc2_readl(hsotg, GINTSTS) & GINTSTS_GOUTNAKEFF))
dwc2_set_bit(hsotg, DCTL, DCTL_SGOUTNAK);
+ if (!using_dma(hsotg)) {
+ /* Wait for GINTSTS_RXFLVL interrupt */
+ if (dwc2_hsotg_wait_bit_set(hsotg, GINTSTS,
+ GINTSTS_RXFLVL, 100)) {
+ dev_warn(hsotg->dev, "%s: timeout GINTSTS.RXFLVL\n",
+ __func__);
+ } else {
+ /*
+ * Pop GLOBAL OUT NAK status packet from RxFIFO
+ * to assert GOUTNAKEFF interrupt
+ */
+ dwc2_readl(hsotg, GRXSTSP);
+ }
+ }
+
/* Wait for global nak to take effect */
if (dwc2_hsotg_wait_bit_set(hsotg, GINTSTS,
GINTSTS_GOUTNAKEFF, 100))
@@ -4348,6 +4366,9 @@ static int dwc2_hsotg_ep_sethalt(struct usb_ep *ep, int value, bool now)
epctl = dwc2_readl(hs, epreg);
if (value) {
+ /* Unmask GOUTNAKEFF interrupt */
+ dwc2_hsotg_en_gsint(hs, GINTSTS_GOUTNAKEFF);
+
if (!(dwc2_readl(hs, GINTSTS) & GINTSTS_GOUTNAKEFF))
dwc2_set_bit(hs, DCTL, DCTL_SGOUTNAK);
// STALL bit will be set in GOUTNAKEFF interrupt handler
--
2.32.0
This is a note to let you know that I've just added the patch titled
usb: xhci: avoid renesas_usb_fw.mem when it's unusable
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 0665e387318607d8269bfdea60723c627c8bae43 Mon Sep 17 00:00:00 2001
From: Greg Thelen <gthelen(a)google.com>
Date: Fri, 2 Jul 2021 00:12:24 -0700
Subject: usb: xhci: avoid renesas_usb_fw.mem when it's unusable
Commit a66d21d7dba8 ("usb: xhci: Add support for Renesas controller with
memory") added renesas_usb_fw.mem firmware reference to xhci-pci. Thus
modinfo indicates xhci-pci.ko has "firmware: renesas_usb_fw.mem". But
the firmware is only actually used with CONFIG_USB_XHCI_PCI_RENESAS. An
unusable firmware reference can trigger safety checkers which look for
drivers with unmet firmware dependencies.
Avoid referring to renesas_usb_fw.mem in circumstances when it cannot be
loaded (when CONFIG_USB_XHCI_PCI_RENESAS isn't set).
Fixes: a66d21d7dba8 ("usb: xhci: Add support for Renesas controller with memory")
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Thelen <gthelen(a)google.com>
Link: https://lore.kernel.org/r/20210702071224.3673568-1-gthelen@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-pci.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 18c2bbddf080..1c9a7957c45c 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -636,7 +636,14 @@ static const struct pci_device_id pci_ids[] = {
{ /* end: all zeroes */ }
};
MODULE_DEVICE_TABLE(pci, pci_ids);
+
+/*
+ * Without CONFIG_USB_XHCI_PCI_RENESAS renesas_xhci_check_request_fw() won't
+ * load firmware, so don't encumber the xhci-pci driver with it.
+ */
+#if IS_ENABLED(CONFIG_USB_XHCI_PCI_RENESAS)
MODULE_FIRMWARE("renesas_usb_fw.mem");
+#endif
/* pci driver glue; this is a "new style" PCI driver module */
static struct pci_driver xhci_pci_driver = {
--
2.32.0