I'm announcing the release of the 5.4.123 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
drivers/usb/dwc3/gadget.c | 4 +++
include/net/nfc/nci_core.h | 1
kernel/bpf/verifier.c | 46 ++++++++++++++++++++++++-----------------
net/nfc/nci/core.c | 1
net/nfc/nci/hci.c | 5 ++++
tools/perf/util/unwind-libdw.c | 35 +++++++++++++++++++++++++++----
7 files changed, 71 insertions(+), 23 deletions(-)
Daniel Borkmann (3):
      bpf: Wrap aux data inside bpf_sanitize_info container
      bpf: Fix mask direction swap upon off reg sign change
      bpf: No need to simulate speculative domain for immediates

Dave Rigby (1):
      perf unwind: Set userdata for all __report_module() paths

Dongliang Mu (1):
      NFC: nci: fix memory leak in nci_allocate_device

Greg Kroah-Hartman (1):
      Linux 5.4.123

Jack Pham (1):
      usb: dwc3: gadget: Enable suspend events

Jan Kratochvil (1):
      perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder

This patchset is based on Frank van der Linden's backport of CVE-2021-29155
fixes to 5.4 and 4.14:
https://lore.kernel.org/stable/20210429220839.15667-1-fllinden@amazon.com/
https://lore.kernel.org/stable/20210501043014.33300-1-fllinden@amazon.com/
With this series, all verifier selftests but one (that has already been
failing, see [1] for more details) succeed.
What the series does is:
* Fix verifier selftests by backporting various bpf/selftest upstream commits +
add two 4.19 specific fixes
* Backport fixes for CVE-2021-29155 from 5.4 stable, including selftest
changes. Only minor context adjustments were made for the 4.19 backport.
The following commits that fix selftests are 4.19 specific:
Ovidiu Panait (2):
1. bpf: fix up selftests after backports were fixed
This is the 4.19 equivalent of
https://lore.kernel.org/stable/20210501043014.33300-3-fllinden@amazon.com/
Basically a backport of upstream commit 80c9b2fae87b ("bpf: add various
test cases to selftests") adapted to 4.19 in order to fix the
selftests that began to fail after CVE-2019-7308 fixes.
2. selftests/bpf: add selftest part of "bpf: improve verifier branch
analysis"
This is a cherry-pick of the selftest parts that have been left out when
backporting 4f7b3e82589e0 ("bpf: improve verifier branch analysis") to 4.19.
[1] Note:
There is one verifier selftest that still fails:
...
#640/p bpf_get_stack return R0 within range FAIL
Failed to load prog 'Invalid argument'!
0: (bf) r6 = r1
1: (7a) *(u64 *)(r10 -8) = 0
2: (bf) r2 = r10
3: (07) r2 += -8
4: (18) r1 = 0xffff89a8f5503000
6: (85) call bpf_map_lookup_elem#1
7: (15) if r0 == 0x0 goto pc+28
R0=map_value(id=0,off=0,ks=8,vs=48,imm=0) R6=ctx(id=0,off=0,imm=0) R10=fp0,call_-1
8: (bf) r7 = r0
9: (b7) r9 = 48
10: (bf) r1 = r6
11: (bf) r2 = r7
12: (b7) r3 = 48
13: (b7) r4 = 256
14: (85) call bpf_get_stack#67
R0=map_value(id=0,off=0,ks=8,vs=48,imm=0) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=8,vs=48,imm=0) R3_w=inv48 R4_w=inv256 R6=ctx(id=0,off=0,imm=0) R7_w=map_value(id=0,off=0,ks=8,vs=48,imm=0) R9_w=inv48 R10=fp0,call_-1
15: (b7) r1 = 0
16: (bf) r8 = r0
17: (67) r8 <<= 32
18: (c7) r8 s>>= 32
19: (cd) if r1 s< r8 goto pc+16
R0=inv(id=0,umax_value=48,var_off=(0x0; 0x3f)) R1=inv0 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=8,vs=48,imm=0) R8=inv0 R9=inv48 R10=fp0,call_-1
20: (1f) r9 -= r8
21: (bf) r2 = r7
22: (0f) r2 += r8
23: (bf) r1 = r9
24: (67) r1 <<= 32
25: (c7) r1 s>>= 32
26: (bf) r3 = r2
27: (0f) r3 += r1
28: (bf) r1 = r7
29: (b7) r5 = 48
30: (0f) r1 += r5
31: (3d) if r3 >= r1 goto pc+4
R0=inv(id=0,umax_value=48,var_off=(0x0; 0x3f)) R1=map_value(id=0,off=48,ks=8,vs=48,imm=0) R2=map_value(id=0,off=0,ks=8,vs=48,imm=0) R3=map_value(id=0,off=48,ks=8,vs=48,imm=0) R5=inv48 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=8,vs=48,imm=0) R8=inv0 R9=inv48 R10=fp0,call_-1
32: (bf) r1 = r6
33: (bf) r3 = r9
34: (b7) r4 = 0
35: (85) call bpf_get_stack#67
R0=inv(id=0,umax_value=48,var_off=(0x0; 0x3f)) R1_w=ctx(id=0,off=0,imm=0) R2=map_value(id=0,off=0,ks=8,vs=48,imm=0) R3_w=inv48 R4_w=inv0 R5=inv48 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=8,vs=48,imm=0) R8=inv0 R9=inv48 R10=fp0,call_-1
36: (95) exit
from 35 to 36: R0=inv(id=0,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=8,vs=48,imm=0) R8=inv0 R9=inv48 R10=fp0,call_-1
36: (95) exit
from 31 to 36: safe
from 19 to 36: safe
from 14 to 15: R0=inv(id=0,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=8,vs=48,imm=0) R9=inv48 R10=fp0,call_-1
15: (b7) r1 = 0
16: (bf) r8 = r0
17: (67) r8 <<= 32
18: (c7) r8 s>>= 32
19: (cd) if r1 s< r8 goto pc+16
R0=inv(id=0,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) R1=inv0 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=8,vs=48,imm=0) R8=inv(id=0,umin_value=18446744071562067968,var_off=(0xffffffff80000000; 0x7fffffff)) R9=inv48 R10=fp0,call_-1
20: (1f) r9 -= r8
21: (bf) r2 = r7
22: (0f) r2 += r8
value -2147483648 makes map_value pointer be out of bounds
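The value in the final error message is consistent with instructions 17 and 18
("r8 <<= 32; r8 s>>= 32") sign-extending the low 32 bits of R0. A minimal sketch
of that arithmetic in plain C, using only the umin_value printed in the log
above. The helper name sext32() and the R0_UMIN macro are made up for this
illustration and are not kernel symbols:

```c
#include <assert.h>
#include <stdint.h>

/* umin_value the verifier prints for R0 on the "from 14 to 15" path. */
#define R0_UMIN 18446744071562067968ULL	/* 0xffffffff80000000 */

/*
 * Hypothetical helper: mimic "r8 <<= 32; r8 s>>= 32" by sign-extending
 * the low 32 bits of a 64-bit register value. Written without signed
 * shifts so the result is fully defined by the C standard.
 */
static int64_t sext32(uint64_t reg)
{
	uint64_t low = reg & 0xffffffffULL;

	if (low & 0x80000000ULL)
		return (int64_t)low - 4294967296LL;	/* low - 2^32 */
	return (int64_t)low;
}

/* Self-check against the value named in the verifier's error message. */
static void check_log_value(void)
{
	assert(sext32(R0_UMIN) == INT32_MIN);	/* -2147483648 */
}
```

So on the path where bpf_get_stack may have returned an error, the lower bound
of r8 is -2147483648, and adding that to the map_value pointer in instruction 22
is what the verifier rejects.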
This failure was introduced after the following 4.19 fix:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
In 5.4 it was fixed by the following commits, but backporting them to 4.19 is
not enough to fix the failing test:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
After bisect, the following upstream commit needs to be present as well in
order for the selftest to pass, but I am not sure if it is suitable for stable
backport:
https://github.com/torvalds/linux/commit/2589726d12a1b12eaaa93c7f1ea64287e3…
Andrey Ignatov (1):
      selftests/bpf: Test narrow loads with off > 0 in test_verifier

Daniel Borkmann (8):
      bpf: Move off_reg into sanitize_ptr_alu
      bpf: Ensure off_reg has no mixed signed bounds for all types
      bpf: Rework ptr_limit into alu_limit and add common error path
      bpf: Improve verifier error messages for users
      bpf: Refactor and streamline bounds check into helper
      bpf: Move sanitize_val_alu out of op switch
      bpf: Tighten speculative pointer arithmetic mask
      bpf: Update selftests to reflect new error states

Ovidiu Panait (2):
      bpf: fix up selftests after backports were fixed
      selftests/bpf: add selftest part of "bpf: improve verifier branch analysis"

Piotr Krysiuk (1):
      bpf, selftests: Fix up some test_verifier cases for unprivileged

kernel/bpf/verifier.c | 229 ++++++++++++++------
tools/testing/selftests/bpf/test_verifier.c | 104 +++++++--
2 files changed, 241 insertions(+), 92 deletions(-)
--
2.17.1
Dear Greg, Dear Sasha,
please consider applying the patch below to v4.4-stable.
It is a backport of upstream commit 794aaf01444d, which has already
been applied to all stable kernels going back to v4.14.
Thanks!
Lukas
-- >8 --
From 429a36a750568599640ae8b9e603d639181fee9a Mon Sep 17 00:00:00 2001
From: "William A. Kennington III" <wak(a)google.com>
Date: Wed, 7 Apr 2021 02:55:27 -0700
Subject: [PATCH] spi: Fix use-after-free with devm_spi_alloc_*
commit 794aaf01444d4e765e2b067cba01cc69c1c68ed9 upstream.
We can't rely on the contents of the devres list during
spi_unregister_controller(), as the list is already torn down at the
time we perform devres_find() for devm_spi_release_controller. This
causes devices registered with devm_spi_alloc_{master,slave}() to be
mistakenly identified as legacy, non-devm managed devices and have their
reference counters decremented below 0.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 660 at lib/refcount.c:28 refcount_warn_saturate+0x108/0x174
[<b0396f04>] (refcount_warn_saturate) from [<b03c56a4>] (kobject_put+0x90/0x98)
[<b03c5614>] (kobject_put) from [<b0447b4c>] (put_device+0x20/0x24)
r4:b6700140
[<b0447b2c>] (put_device) from [<b07515e8>] (devm_spi_release_controller+0x3c/0x40)
[<b07515ac>] (devm_spi_release_controller) from [<b045343c>] (release_nodes+0x84/0xc4)
r5:b6700180 r4:b6700100
[<b04533b8>] (release_nodes) from [<b0454160>] (devres_release_all+0x5c/0x60)
r8:b1638c54 r7:b117ad94 r6:b1638c10 r5:b117ad94 r4:b163dc10
[<b0454104>] (devres_release_all) from [<b044e41c>] (__device_release_driver+0x144/0x1ec)
r5:b117ad94 r4:b163dc10
[<b044e2d8>] (__device_release_driver) from [<b044f70c>] (device_driver_detach+0x84/0xa0)
r9:00000000 r8:00000000 r7:b117ad94 r6:b163dc54 r5:b1638c10 r4:b163dc10
[<b044f688>] (device_driver_detach) from [<b044d274>] (unbind_store+0xe4/0xf8)
Instead, determine the devm allocation state as a flag on the
controller which is guaranteed to be stable during cleanup.
Fixes: 5e844cc37a5c ("spi: Introduce device-managed SPI controller allocation")
Signed-off-by: William A. Kennington III <wak(a)google.com>
Link: https://lore.kernel.org/r/20210407095527.2771582-1-wak@google.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
[lukas: backport to v4.4.270]
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
---
drivers/spi/spi.c | 9 ++-------
include/linux/spi/spi.h | 3 +++
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index e85feee750e3..f743b95d5171 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1762,6 +1762,7 @@ struct spi_master *devm_spi_alloc_master(struct device *dev, unsigned int size)
master = spi_alloc_master(dev, size);
if (master) {
+ master->devm_allocated = true;
*ptr = master;
devres_add(dev, ptr);
} else {
@@ -1951,11 +1952,6 @@ int devm_spi_register_master(struct device *dev, struct spi_master *master)
}
EXPORT_SYMBOL_GPL(devm_spi_register_master);
-static int devm_spi_match_master(struct device *dev, void *res, void *master)
-{
- return *(struct spi_master **)res == master;
-}
-
static int __unregister(struct device *dev, void *null)
{
spi_unregister_device(to_spi_device(dev));
@@ -1994,8 +1990,7 @@ void spi_unregister_master(struct spi_master *master)
/* Release the last reference on the master if its driver
* has not yet been converted to devm_spi_alloc_master().
*/
- if (!devres_find(master->dev.parent, devm_spi_release_master,
- devm_spi_match_master, master))
+ if (!master->devm_allocated)
put_device(&master->dev);
if (IS_ENABLED(CONFIG_SPI_DYNAMIC))
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index f5d387140c46..da487e905337 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -425,6 +425,9 @@ struct spi_master {
#define SPI_MASTER_MUST_RX BIT(3) /* requires rx */
#define SPI_MASTER_MUST_TX BIT(4) /* requires tx */
+ /* flag indicating this is a non-devres managed controller */
+ bool devm_allocated;
+
/* lock and mutex for SPI bus locking */
spinlock_t bus_lock_spinlock;
struct mutex bus_lock_mutex;
--
2.31.1
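The refcount underflow and the effect of the devm_allocated flag can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions: struct model_master and all model_*() functions are hypothetical
stand-ins, the reference count is a plain int, and the "devres lookup result"
is passed in as a parameter instead of calling devres_find().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct spi_master; only what the sketch needs. */
struct model_master {
	int refcount;		/* models the device reference count */
	bool devm_allocated;	/* set once at allocation, as in the patch */
};

/* Models spi_alloc_master(): the allocation holds one reference. */
static void model_alloc(struct model_master *m)
{
	assert(m != NULL);
	m->refcount = 1;
	m->devm_allocated = false;
}

/* Models devm_spi_alloc_master(): same allocation, plus the flag. */
static void model_devm_alloc(struct model_master *m)
{
	model_alloc(m);
	m->devm_allocated = true;
}

/*
 * Pre-patch tail of spi_unregister_master(): the decision depends on a
 * devres lookup, which comes back empty once the list is torn down.
 */
static void model_unregister_old(struct model_master *m, bool devres_entry_found)
{
	if (!devres_entry_found)
		m->refcount--;
}

/* Post-patch tail of spi_unregister_master(): the decision uses the flag. */
static void model_unregister_new(struct model_master *m)
{
	if (!m->devm_allocated)
		m->refcount--;
}

/* Models the devres release callback dropping the devm-held reference. */
static void model_devres_release(struct model_master *m)
{
	m->refcount--;
}
```

In this model, a devm-allocated master that goes through model_unregister_old()
with an empty lookup and then through model_devres_release() ends at -1, which
corresponds to the refcount warning above. With model_unregister_new() the same
sequence ends at 0.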
Dear Greg, Dear Sasha,
please consider applying the patch below to v4.9-stable.
It is a backport of upstream commit 794aaf01444d, which has already
been applied to all stable kernels going back to v4.14.
Thanks!
Lukas
-- >8 --
From: "William A. Kennington III" <wak(a)google.com>
Date: Wed, 7 Apr 2021 02:55:27 -0700
Subject: [PATCH] spi: Fix use-after-free with devm_spi_alloc_*
commit 794aaf01444d4e765e2b067cba01cc69c1c68ed9 upstream.
We can't rely on the contents of the devres list during
spi_unregister_controller(), as the list is already torn down at the
time we perform devres_find() for devm_spi_release_controller. This
causes devices registered with devm_spi_alloc_{master,slave}() to be
mistakenly identified as legacy, non-devm managed devices and have their
reference counters decremented below 0.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 660 at lib/refcount.c:28 refcount_warn_saturate+0x108/0x174
[<b0396f04>] (refcount_warn_saturate) from [<b03c56a4>] (kobject_put+0x90/0x98)
[<b03c5614>] (kobject_put) from [<b0447b4c>] (put_device+0x20/0x24)
r4:b6700140
[<b0447b2c>] (put_device) from [<b07515e8>] (devm_spi_release_controller+0x3c/0x40)
[<b07515ac>] (devm_spi_release_controller) from [<b045343c>] (release_nodes+0x84/0xc4)
r5:b6700180 r4:b6700100
[<b04533b8>] (release_nodes) from [<b0454160>] (devres_release_all+0x5c/0x60)
r8:b1638c54 r7:b117ad94 r6:b1638c10 r5:b117ad94 r4:b163dc10
[<b0454104>] (devres_release_all) from [<b044e41c>] (__device_release_driver+0x144/0x1ec)
r5:b117ad94 r4:b163dc10
[<b044e2d8>] (__device_release_driver) from [<b044f70c>] (device_driver_detach+0x84/0xa0)
r9:00000000 r8:00000000 r7:b117ad94 r6:b163dc54 r5:b1638c10 r4:b163dc10
[<b044f688>] (device_driver_detach) from [<b044d274>] (unbind_store+0xe4/0xf8)
Instead, determine the devm allocation state as a flag on the
controller which is guaranteed to be stable during cleanup.
Fixes: 5e844cc37a5c ("spi: Introduce device-managed SPI controller allocation")
Signed-off-by: William A. Kennington III <wak(a)google.com>
Link: https://lore.kernel.org/r/20210407095527.2771582-1-wak@google.com
Signed-off-by: Mark Brown <broonie(a)kernel.org>
[lukas: backport to v4.9.270]
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
---
drivers/spi/spi.c | 9 ++-------
include/linux/spi/spi.h | 3 +++
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index f0ba5eb26128..84e2296c45a2 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1869,6 +1869,7 @@ struct spi_master *devm_spi_alloc_master(struct device *dev, unsigned int size)
master = spi_alloc_master(dev, size);
if (master) {
+ master->devm_allocated = true;
*ptr = master;
devres_add(dev, ptr);
} else {
@@ -2059,11 +2060,6 @@ int devm_spi_register_master(struct device *dev, struct spi_master *master)
}
EXPORT_SYMBOL_GPL(devm_spi_register_master);
-static int devm_spi_match_master(struct device *dev, void *res, void *master)
-{
- return *(struct spi_master **)res == master;
-}
-
static int __unregister(struct device *dev, void *null)
{
spi_unregister_device(to_spi_device(dev));
@@ -2102,8 +2098,7 @@ void spi_unregister_master(struct spi_master *master)
/* Release the last reference on the master if its driver
* has not yet been converted to devm_spi_alloc_master().
*/
- if (!devres_find(master->dev.parent, devm_spi_release_master,
- devm_spi_match_master, master))
+ if (!master->devm_allocated)
put_device(&master->dev);
if (IS_ENABLED(CONFIG_SPI_DYNAMIC))
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 8470695e5dd7..9c8445f1af0c 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -443,6 +443,9 @@ struct spi_master {
#define SPI_MASTER_MUST_RX BIT(3) /* requires rx */
#define SPI_MASTER_MUST_TX BIT(4) /* requires tx */
+ /* flag indicating this is a non-devres managed controller */
+ bool devm_allocated;
+
/*
* on some hardware transfer / message size may be constrained
* the limit may depend on device transfer settings
--
2.31.1
There exists a possible scenario in which dwc3_gadget_init() can fail:
during a host -> peripheral mode switch in dwc3_set_mode(), when
a pending gadget driver fails to bind. Then, if the DRD undergoes
another mode switch from peripheral->host the resulting
dwc3_gadget_exit() will attempt to reference an invalid and dangling
dwc->gadget pointer as well as call dma_free_coherent() on unmapped
DMA pointers.
The exact scenario can be reproduced as follows:
- Start DWC3 in peripheral mode
- Configure ConfigFS gadget with FunctionFS instance (or use g_ffs)
- Run FunctionFS userspace application (open EPs, write descriptors, etc)
- Bind gadget driver to DWC3's UDC
- Switch DWC3 to host mode
=> dwc3_gadget_exit() is called. usb_del_gadget() will put the
ConfigFS driver instance on the gadget_driver_pending_list
- Stop FunctionFS application (closes the ep files)
- Switch DWC3 to peripheral mode
=> dwc3_gadget_init() fails as usb_add_gadget() calls
check_pending_gadget_drivers() and attempts to rebind the UDC
to the ConfigFS gadget but fails with -19 (-ENODEV) because the
FFS instance is not in FFS_READY state.
- Switch DWC3 back to host mode
=> dwc3_gadget_exit() is called again, but this time dwc->gadget
is invalid.
Although it can be argued that userspace should take responsibility
for ensuring that the FunctionFS application be ready prior to
allowing the composite driver bind to the UDC, failure to do so
should not result in a panic from the kernel driver.
Fix this by setting dwc->gadget to NULL in the failure path of
dwc3_gadget_init() and add a check to dwc3_gadget_exit() to bail out
unless the gadget pointer is valid.
Fixes: e81a7018d93a ("usb: dwc3: allocate gadget structure dynamically")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Jack Pham <jackp(a)codeaurora.org>
---
Hi Felipe,
Although I marked the 'Fixes' tag above to e81a7018d93a, the problem
theoretically exists prior to Peter's change. But I'm not sure how
best to fix on versions prior to this change since dwc->gadget used
to be an embedded struct so we can't do a simple NULL check as below.
Suggestions on alternative approaches welcome if we want to proceed
with backporting to older (pre-5.9) stable releases.
Thanks,
Jack
drivers/usb/dwc3/gadget.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 612825a39f82..65d9b7227752 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -4046,6 +4046,7 @@ int dwc3_gadget_init(struct dwc3 *dwc)
dwc3_gadget_free_endpoints(dwc);
err4:
usb_put_gadget(dwc->gadget);
+ dwc->gadget = NULL;
err3:
dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce,
dwc->bounce_addr);
@@ -4065,6 +4066,9 @@ int dwc3_gadget_init(struct dwc3 *dwc)
void dwc3_gadget_exit(struct dwc3 *dwc)
{
+ if (!dwc->gadget)
+ return;
+
usb_del_gadget(dwc->gadget);
dwc3_gadget_free_endpoints(dwc);
usb_put_gadget(dwc->gadget);
--
2.24.0
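The pattern used by the fix (clear the pointer on the failure path, bail out
early in the exit path) can be shown outside the kernel. The following is a
minimal sketch, not the driver code: struct model_dwc, struct model_gadget and
both functions are hypothetical, malloc()/free() stand in for the gadget
allocation and usb_put_gadget(), and the exit function returns an int only so
the early return is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the gadget and the controller state. */
struct model_gadget {
	int unused;
};

struct model_dwc {
	struct model_gadget *gadget;
};

/* Models dwc3_gadget_init(); bind_fails simulates the failed rebind. */
static int model_gadget_init(struct model_dwc *dwc, int bind_fails)
{
	assert(dwc != NULL);

	dwc->gadget = malloc(sizeof(*dwc->gadget));
	if (!dwc->gadget)
		return -1;

	if (bind_fails) {
		free(dwc->gadget);
		dwc->gadget = NULL;	/* the fix: leave no dangling pointer */
		return -19;		/* -ENODEV, as in the report */
	}

	return 0;
}

/*
 * Models dwc3_gadget_exit(). Returns 0 if there was nothing to tear
 * down and 1 if the gadget was released (the real function is void).
 */
static int model_gadget_exit(struct model_dwc *dwc)
{
	assert(dwc != NULL);

	if (!dwc->gadget)
		return 0;

	free(dwc->gadget);
	return 1;
}
```

Without the NULL assignment in the failure path, the second mode switch would
reach free() with a pointer that was already released, which is the userspace
analogue of the dangling dwc->gadget access described above.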
This patch series contains stability fixes and error handling for remoteproc.
The changes included in this series do the following:
Patch 1: Fixes the creation of the rproc character device.
Patch 2: Validates rproc as the first step of rproc_add().
Patch 3: Adds error handling in rproc_add().
Siddharth Gupta (3):
remoteproc: core: Move cdev add before device add
remoteproc: core: Move validate before device add
remoteproc: core: Cleanup device in case of failure
drivers/remoteproc/remoteproc_core.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
--
Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
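The ordering described in the cover letter (validate first, add the character
device before the device, clean up on failure) follows a common pattern that
can be sketched as below. This is an illustration only and not the code from
the patches: struct model_rproc, model_rproc_add() and the later_step_fails
parameter are hypothetical, and which later step can fail is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the registration state of one remote processor. */
struct model_rproc {
	bool valid;		/* would the validation step accept it? */
	bool cdev_added;	/* character device registered */
	bool dev_added;		/* device registered */
};

/* Models the reordered rproc_add(); returns 0 or a negative errno value. */
static int model_rproc_add(struct model_rproc *r, bool later_step_fails)
{
	assert(r != NULL);

	/* Validate first, before anything has been registered. */
	if (!r->valid)
		return -22;	/* -EINVAL */

	r->cdev_added = true;	/* character device first ... */
	r->dev_added = true;	/* ... then the device itself */

	if (later_step_fails)
		goto err_unwind;

	return 0;

err_unwind:
	/* Undo in reverse order so no half-registered state is left behind. */
	r->dev_added = false;
	r->cdev_added = false;
	return -12;	/* -ENOMEM, chosen arbitrarily for the model */
}
```

Validating before registering means the invalid case needs no cleanup at all,
and the single unwind label keeps the failure path in one place.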
The patch titled
Subject: kthread: fix kthread_mod_delayed_work vs kthread_cancel_delayed_work_sync race
has been removed from the -mm tree. Its filename was
kthread-fix-kthread_mod_delayed_work-vs-kthread_cancel_delayed_work_sync-race.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Martin Liu <liumartin(a)google.com>
Subject: kthread: fix kthread_mod_delayed_work vs kthread_cancel_delayed_work_sync race
We encountered a system hang during testing. The call stack
is as follows:
schedule+0x80/0x100
schedule_timeout+0x48/0x138
wait_for_common+0xa4/0x134
wait_for_completion+0x1c/0x2c
kthread_flush_work+0x114/0x1cc
kthread_cancel_work_sync.llvm.16514401384283632983+0xe8/0x144
kthread_cancel_delayed_work_sync+0x18/0x2c
xxxx_pm_notify+0xb0/0xd8
blocking_notifier_call_chain_robust+0x80/0x194
pm_notifier_call_chain_robust+0x28/0x4c
suspend_prepare+0x40/0x260
enter_state+0x80/0x3f4
pm_suspend+0x60/0xdc
state_store+0x108/0x144
kobj_attr_store+0x38/0x88
sysfs_kf_write+0x64/0xc0
kernfs_fop_write_iter+0x108/0x1d0
vfs_write+0x2f4/0x368
ksys_write+0x7c/0xec
While investigating, we found a race between
kthread_mod_delayed_work and kthread_cancel_delayed_work_sync. The result
of the race can be reproduced simply by calling kthread_mod_delayed_work
followed by kthread_flush_work.
The problem is that the spinlock held by kthread_mod_delayed_work is
released in __kthread_cancel_work, which opens a race window in which
kthread_cancel_delayed_work_sync can change the canceling count that is
used to prevent the dwork from being requeued before kthread_flush_work is
called. However, we don't check the canceling count after returning from
__kthread_cancel_work and insert the dwork into the worker anyway. As a
result, the following kthread_flush_work inserts the flush work at the tail
of the dwork, which is on the worker's delayed_work_list. The flush work is
therefore never moved to the worker's work_list to be executed. Finally,
kthread_cancel_delayed_work_sync can never complete and waits forever. The
code sequence is as follows:
Thread A                                Thread B
kthread_mod_delayed_work
spin_lock
__kthread_cancel_work
canceling = 1
spin_unlock
                                        kthread_cancel_delayed_work_sync
                                        spin_lock
                                        kthread_cancel_work
                                        canceling = 2
                                        spin_unlock
del_timer_sync
spin_lock
canceling = 1      // canceling count was updated by Thread B beforehand
queue_delayed_work // dwork is put into the worker's delayed_work_list
                   // without checking the canceling count
spin_unlock
                                        kthread_flush_work
                                        spin_lock
                                        Insert flush work   // at the tail of the
                                                            // dwork which is at
                                                            // the worker's
                                                            // delayed_work_list
                                        spin_unlock
                                        wait_for_completion // Thread B stuck here as
                                                            // flush work will never
                                                            // get executed
The canceling count can change in __kthread_cancel_work because the spinlock
is released and reacquired in between, so check the count again before
queuing the delayed work to avoid the race.
Link: https://lkml.kernel.org/r/20210513065458.941403-1-liumartin@google.com
Fixes: 37be45d49dec2 ("kthread: allow to cancel kthread work")
Signed-off-by: Martin Liu <liumartin(a)google.com>
Tested-by: David Chao <davidchao(a)google.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: "Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Jiri Kosina <jkosina(a)suse.cz>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.cz>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: Nick Desaulniers <ndesaulniers(a)google.com>
Cc: <jenhaochen(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/kthread.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
--- a/kernel/kthread.c~kthread-fix-kthread_mod_delayed_work-vs-kthread_cancel_delayed_work_sync-race
+++ a/kernel/kthread.c
@@ -1181,6 +1181,19 @@ bool kthread_mod_delayed_work(struct kth
goto out;
ret = __kthread_cancel_work(work, true, &flags);
+
+ /*
+ * Canceling could run in parallel from kthread_cancel_delayed_work_sync
+ * and change work's canceling count as the spinlock is released and regain
+ * in __kthread_cancel_work so we need to check the count again. Otherwise,
+ * we might incorrectly queue the dwork and further cause
+ * cancel_delayed_work_sync thread waiting for flush dwork endlessly.
+ */
+ if (work->canceling) {
+ ret = false;
+ goto out;
+ }
+
fast_queue:
__kthread_queue_delayed_work(worker, dwork, delay);
out:
_
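The effect of the added check can be modelled in a single thread by simulating
the concurrent cancel at the point where the lock is dropped. This is a sketch
under stated assumptions, not the kernel code: struct model_work and the
model_*() functions are hypothetical, the concurrent_cancel parameter stands in
for Thread B running kthread_cancel_delayed_work_sync, and the queued flag
replaces the real insertion into the delayed work list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fields of the work item that matter here. */
struct model_work {
	int canceling;	/* number of cancel operations in flight */
	bool queued;	/* true once the delayed work has been (re)queued */
};

/*
 * Models __kthread_cancel_work(): the count is raised, the lock is
 * dropped around the timer deletion, then the count is lowered again.
 * concurrent_cancel simulates Thread B raising the count in the window.
 */
static void model_cancel_work(struct model_work *w, bool concurrent_cancel)
{
	w->canceling++;
	/* lock released here */
	if (concurrent_cancel)
		w->canceling++;	/* Thread B: cancel in progress */
	/* lock re-acquired here */
	w->canceling--;
}

/* Models kthread_mod_delayed_work() with the re-check from the patch. */
static void model_mod_delayed_work(struct model_work *w, bool concurrent_cancel)
{
	assert(w != NULL);

	if (w->canceling)
		return;

	model_cancel_work(w, concurrent_cancel);

	/* The added check: another cancel may have started in the window. */
	if (w->canceling)
		return;

	w->queued = true;
}
```

When no cancel runs in the window the work is queued as before. When one does,
the count is still nonzero after model_cancel_work() returns, so the work is
not queued and the later flush has nothing stale to wait behind.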