This is the start of the stable review cycle for the 5.15.74 release.
There are 27 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 15 Oct 2022 17:51:33 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.74-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.15.74-rc1
Shunsuke Mie <mie(a)igel.co.jp>
misc: pci_endpoint_test: Fix pci_endpoint_test_{copy,write,read}() panic
Shunsuke Mie <mie(a)igel.co.jp>
misc: pci_endpoint_test: Aggregate params checking for xfer
Cameron Gutman <aicommander(a)gmail.com>
Input: xpad - fix wireless 360 controller breaking after suspend
Pavel Rojtberg <rojtberg(a)gmail.com>
Input: xpad - add supported devices as contributed on github
Johannes Berg <johannes.berg(a)intel.com>
wifi: cfg80211: update hidden BSSes to avoid WARN_ON
Johannes Berg <johannes.berg(a)intel.com>
wifi: mac80211: fix crash in beacon protection for P2P-device
Johannes Berg <johannes.berg(a)intel.com>
wifi: mac80211_hwsim: avoid mac80211 warning on bad rate
Johannes Berg <johannes.berg(a)intel.com>
wifi: cfg80211: avoid nontransmitted BSS list corruption
Johannes Berg <johannes.berg(a)intel.com>
wifi: cfg80211: fix BSS refcounting bugs
Johannes Berg <johannes.berg(a)intel.com>
wifi: cfg80211: ensure length byte is present before access
Johannes Berg <johannes.berg(a)intel.com>
wifi: cfg80211/mac80211: reject bad MBSSID elements
Johannes Berg <johannes.berg(a)intel.com>
wifi: cfg80211: fix u8 overflow in cfg80211_update_notlisted_nontrans()
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: use expired timer rather than wq for mixing fast pool
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: avoid reading two cache lines on irq randomness
Giovanni Cabiddu <giovanni.cabiddu(a)intel.com>
Revert "crypto: qat - reduce size of mapped region"
Nathan Lynch <nathanl(a)linux.ibm.com>
Revert "powerpc/rtas: Implement reentrant rtas call"
Frank Wunderlich <frank-w(a)public-files.de>
USB: serial: qcserial: add new usb-id for Dell branded EM7455
Linus Torvalds <torvalds(a)linux-foundation.org>
scsi: stex: Properly zero out the passthrough command structure
Orlando Chamberlain <redecorating(a)protonmail.com>
efi: Correct Macmini DMI match in uefi cert quirk
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda: Fix position reporting on Poulsbo
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: clamp credited irq bits to maximum mixed
Jason A. Donenfeld <Jason(a)zx2c4.com>
random: restore O_NONBLOCK support
Hu Weiwen <sehuww(a)mail.scut.edu.cn>
ceph: don't truncate file in atomic_open
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: fix leak of nilfs_root in case of writer thread creation failure
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: fix use-after-free bug of struct nilfs_root
Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
nilfs2: fix NULL pointer dereference at nilfs_bmap_lookup_at_level()
-------------
Diffstat:
Makefile | 4 +-
arch/powerpc/include/asm/paca.h | 1 -
arch/powerpc/include/asm/rtas.h | 1 -
arch/powerpc/kernel/paca.c | 32 -----------
arch/powerpc/kernel/rtas.c | 54 -------------------
arch/powerpc/sysdev/xics/ics-rtas.c | 22 ++++----
drivers/char/mem.c | 4 +-
drivers/char/random.c | 25 ++++++---
drivers/crypto/qat/qat_common/qat_asym_algs.c | 12 ++---
drivers/input/joystick/xpad.c | 20 ++++++-
drivers/misc/pci_endpoint_test.c | 34 +++++++++---
drivers/net/wireless/mac80211_hwsim.c | 2 +
drivers/scsi/stex.c | 17 +++---
drivers/usb/serial/qcserial.c | 1 +
fs/ceph/file.c | 10 ++--
fs/nilfs2/inode.c | 19 ++++++-
fs/nilfs2/segment.c | 21 +++++---
include/scsi/scsi_cmnd.h | 2 +-
net/mac80211/rx.c | 12 +++--
net/mac80211/util.c | 2 +
net/wireless/scan.c | 77 +++++++++++++++++----------
security/integrity/platform_certs/load_uefi.c | 2 +-
sound/pci/hda/hda_intel.c | 3 +-
23 files changed, 198 insertions(+), 179 deletions(-)
From: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
[ Upstream commit a7c01fa93aeb03ab76cd3cb2107990dd160498e6 ]
I was recently surprised to learn that msleep_interruptible(),
wait_for_completion_interruptible_timeout(), and related functions
simply hung when I called kthread_stop() on kthreads using them. The
fix for the msleep_interruptible() case was simply to switch to
schedule_timeout_interruptible(). Why?
The reason is that msleep_interruptible(), and many functions just like
it, has a loop like this:
	while (timeout && !signal_pending(current))
		timeout = schedule_timeout_interruptible(timeout);
The call to kthread_stop() woke up the thread, so
schedule_timeout_interruptible() returned early, but because
signal_pending() returned false, it went back into another timeout,
which was never woken up.
This wait loop pattern is common to various pieces of code, and I
suspect that the subtle misuse in a kthread that caused a deadlock in
the code I looked at last week is also found elsewhere.
So this commit causes signal_pending() to return true when
kthread_stop() is called, by setting TIF_NOTIFY_SIGNAL.
The same also probably applies to the similar kthread_park()
functionality, but that can be addressed later, as its semantics are
slightly different.
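The behaviour described above can be sketched as a small userspace model.
This is not kernel code: struct model_kthread, model_signal_pending(),
model_schedule_timeout() and model_wait_loop() are hypothetical stand-ins
for the task flags, signal_pending(), schedule_timeout_interruptible() and
the wait loop, and the "woken early with the full timeout remaining" return
value is a simplifying assumption.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a kthread; none of this is kernel API. */
struct model_kthread {
	bool notify_signal;  /* stands in for TIF_NOTIFY_SIGNAL */
	bool should_stop;    /* stands in for KTHREAD_SHOULD_STOP */
	bool stop_queued;    /* a kthread_stop() arrives during the next sleep */
	bool stop_sets_flag; /* true models kthread_stop() with this patch */
};

/* Stand-in for signal_pending(current). */
static bool model_signal_pending(const struct model_kthread *t)
{
	return t->notify_signal;
}

/*
 * Stand-in for schedule_timeout_interruptible(): returns the remaining
 * timeout when woken early, or 0 when the whole timeout elapsed.
 */
static long model_schedule_timeout(struct model_kthread *t, long timeout)
{
	if (t->stop_queued) {
		/* Model kthread_stop(): set the stop bit, then wake the thread. */
		t->stop_queued = false;
		t->should_stop = true;
		if (t->stop_sets_flag)
			t->notify_signal = true;
		return timeout; /* woken early: assume no time has passed */
	}
	return 0; /* nobody woke us: the full timeout elapsed */
}

/*
 * The wait loop from the commit message. Returns how many times the
 * thread went to sleep; *remaining receives the timeout left at exit.
 */
static int model_wait_loop(struct model_kthread *t, long timeout,
			   long *remaining)
{
	int sleeps = 0;

	while (timeout && !model_signal_pending(t)) {
		timeout = model_schedule_timeout(t, timeout);
		sleeps++;
	}
	*remaining = timeout;
	return sleeps;
}
```

In this model, a thread with stop_sets_flag == false sleeps twice (the
second sleep is the one that nothing ever wakes), while a thread with
stop_sets_flag == true leaves the loop after the first wake-up with its
timeout still remaining.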
Cc: Eric W. Biederman <ebiederm(a)xmission.com>
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
v1: https://lkml.kernel.org/r/20220627120020.608117-1-Jason@zx2c4.com
v2: https://lkml.kernel.org/r/20220627145716.641185-1-Jason@zx2c4.com
v3: https://lkml.kernel.org/r/20220628161441.892925-1-Jason@zx2c4.com
v4: https://lkml.kernel.org/r/20220711202136.64458-1-Jason@zx2c4.com
v5: https://lkml.kernel.org/r/20220711232123.136330-1-Jason@zx2c4.com
Signed-off-by: Eric W. Biederman <ebiederm(a)xmission.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
kernel/kthread.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 5b37a8567168..c8ca1007e2dd 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -644,6 +644,7 @@ int kthread_stop(struct task_struct *k)
 	kthread = to_kthread(k);
 	set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
 	kthread_unpark(k);
+	set_tsk_thread_flag(k, TIF_NOTIFY_SIGNAL);
 	wake_up_process(k);
 	wait_for_completion(&kthread->exited);
 	ret = k->exit_code;
--
2.35.1
If an error is detected when a user-space process accesses a corrupt
memory location, the CPU may take an abort. The platform firmware then
notifies the kernel via NMI-like notifications, e.g. NOTIFY_SEA,
NOTIFY_SOFTWARE_DELEGATED, etc.
For NMI-like notifications, commit 7f17b4a121d0 ("ACPI: APEI: Kick the
memory_failure() queue for synchronous errors") keeps track of whether
memory_failure() work was queued, and makes task_work pending to flush out
the queue so that the work is processed before returning to user-space.
The code uses init_mm to check whether the error occurred in user space:
	if (current->mm != &init_mm)
The condition is always true, because _nobody_ ever has "init_mm" as a real
VM any more.
In addition to aborts, errors can also be signaled as asynchronous
exceptions, such as interrupts and SErrors. In that case, the interrupted
current process could be any kind of thread. When a kernel thread is
interrupted, the ghes_kick_task_work work deferred to task_work will never
be processed, because entry_handler returns via ret_to_kernel() instead
of ret_to_user(). Consequently, the estatus_node allocated from
ghes_estatus_pool in ghes_in_nmi_queue_one_entry() will not be freed.
After around 200 allocations on our platform, the ghes_estatus_pool
runs out of memory and ghes_in_nmi_queue_one_entry() returns ENOMEM. As a
result, the event fails to be processed.
sdei: event 805 on CPU 113 failed with error: -2
Finally, a lot of unhandled events may cause platform firmware to exceed
some threshold and reboot.
The condition should generally just be
	if (current->mm)
as described in the active_mm.rst documentation.
With that check, if an asynchronous error is detected while a kernel thread
is running (e.g. when detected by a background scrubber), no task_work is
added to it, which is what the original patch intended.
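The difference between the two conditions can be sketched with a small
userspace model. This is not kernel code: struct model_mm, struct
model_task, model_init_mm, old_check() and new_check() are hypothetical
names, and the model only assumes what is stated above, namely that a
kernel thread has a NULL mm and that no task uses init_mm as its mm.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct mm_struct and struct task_struct. */
struct model_mm {
	int unused;
};

struct model_task {
	struct model_mm *mm; /* NULL for a kernel thread in this model */
};

/* Stand-in for the kernel's init_mm; no modelled task points at it. */
static struct model_mm model_init_mm;

/* The old condition: true for every task, including kernel threads. */
static bool old_check(const struct model_task *t)
{
	return t->mm != &model_init_mm;
}

/* The new condition: true only for tasks that own a user address space. */
static bool new_check(const struct model_task *t)
{
	return t->mm != NULL;
}
```

For a modelled kernel thread (mm == NULL), old_check() returns true and
new_check() returns false; for a modelled user task, both return true. Only
the new condition keeps task_work away from kernel threads.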
Fixes: 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors")
Signed-off-by: Shuai Xue <xueshuai(a)linux.alibaba.com>
---
changes since v1:
- describe the side effect and give more details
drivers/acpi/apei/ghes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d91ad378c00d..80ad530583c9 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -985,7 +985,7 @@ static void ghes_proc_in_irq(struct irq_work *irq_work)
 		ghes_estatus_cache_add(generic, estatus);
 	}
-	if (task_work_pending && current->mm != &init_mm) {
+	if (task_work_pending && current->mm) {
 		estatus_node->task_work.func = ghes_kick_task_work;
 		estatus_node->task_work_cpu = smp_processor_id();
 		ret = task_work_add(current, &estatus_node->task_work,
--
2.20.1.12.g72788fdb