From: Roberto Sassu <roberto.sassu(a)huawei.com>
Commit 08abce60d63fi ("security: Introduce path_post_mknod hook")
introduced security_path_post_mknod(), to replace the IMA-specific call to
ima_post_path_mknod().
For symmetry with security_path_mknod(), security_path_post_mknod() is
called after a successful mknod operation, for any file type, rather than
only for regular files at the time there was the IMA call.
However, as reported by VFS maintainers, successful mknod operation does
not mean that the dentry always has an inode attached to it (for example,
not for FIFOs on a SAMBA mount).
If that condition happens, the kernel crashes when
security_path_post_mknod() attempts to verify if the inode associated to
the dentry is private.
Add an extra check to first verify if there is an inode attached to the
dentry, before checking if the inode is private. Also add the same check to
the current users of the path_post_mknod hook, ima_post_path_mknod() and
evm_post_path_mknod().
Finally, use the proper helper, d_backing_inode(), to retrieve the inode
from the dentry in ima_post_path_mknod().
Cc: stable(a)vger.kernel.org # 6.8.x
Reported-by: Steve French <smfrench(a)gmail.com>
Closes: https://lore.kernel.org/linux-kernel/CAH2r5msAVzxCUHHG8VKrMPUKQHmBpE6K9_vjh…
Fixes: 08abce60d63fi ("security: Introduce path_post_mknod hook")
Signed-off-by: Roberto Sassu <roberto.sassu(a)huawei.com>
---
security/integrity/evm/evm_main.c | 6 ++++--
security/integrity/ima/ima_main.c | 5 +++--
security/security.c | 4 +++-
3 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
index 81dbade5b9b3..ec1659273fcf 100644
--- a/security/integrity/evm/evm_main.c
+++ b/security/integrity/evm/evm_main.c
@@ -1037,11 +1037,13 @@ static void evm_file_release(struct file *file)
static void evm_post_path_mknod(struct mnt_idmap *idmap, struct dentry *dentry)
{
struct inode *inode = d_backing_inode(dentry);
- struct evm_iint_cache *iint = evm_iint_inode(inode);
+ struct evm_iint_cache *iint;
- if (!S_ISREG(inode->i_mode))
+ /* path_post_mknod hook might pass dentries without attached inode. */
+ if (!inode || !S_ISREG(inode->i_mode))
return;
+ iint = evm_iint_inode(inode);
if (iint)
iint->flags |= EVM_NEW_FILE;
}
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index c84e8c55333d..afc883e60cf3 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -719,10 +719,11 @@ static void ima_post_create_tmpfile(struct mnt_idmap *idmap,
static void ima_post_path_mknod(struct mnt_idmap *idmap, struct dentry *dentry)
{
struct ima_iint_cache *iint;
- struct inode *inode = dentry->d_inode;
+ struct inode *inode = d_backing_inode(dentry);
int must_appraise;
- if (!ima_policy_flag || !S_ISREG(inode->i_mode))
+ /* path_post_mknod hook might pass dentries without attached inode. */
+ if (!ima_policy_flag || !inode || !S_ISREG(inode->i_mode))
return;
must_appraise = ima_must_appraise(idmap, inode, MAY_ACCESS,
diff --git a/security/security.c b/security/security.c
index 7e118858b545..455f0749e1b0 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1801,7 +1801,9 @@ EXPORT_SYMBOL(security_path_mknod);
*/
void security_path_post_mknod(struct mnt_idmap *idmap, struct dentry *dentry)
{
- if (unlikely(IS_PRIVATE(d_backing_inode(dentry))))
+ /* Not all dentries have an inode attached after mknod. */
+ if (d_backing_inode(dentry) &&
+ unlikely(IS_PRIVATE(d_backing_inode(dentry))))
return;
call_void_hook(path_post_mknod, idmap, dentry);
}
--
2.34.1
Hi stable team,
Please backport the following the commits to 6.7/6.8 to fix
some i915 type-c/thunderbolt PLL issues:
commit 92b47c3b8b24 ("drm/i915: Replace a memset() with zero initialization")
commit ba407525f824 ("drm/i915: Try to preserve the current shared_dpll for fastset on type-c ports")
commit d283ee5662c6 ("drm/i915: Include the PLL name in the debug messages")
commit 33c7760226c7 ("drm/i915: Suppress old PLL pipe_mask checks for MG/TC/TBT PLLs")
6.7 will need two additional dependencies:
commit f215038f4133 ("drm/i915: Use named initializers for DPLL info")
commit 58046e6cf811 ("drm/i915: Stop printing pipe name as hex")
Thanks.
--
Ville Syrjälä
Intel
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 45bcc0346561daa3f59e19a753cc7f3e08e8dff1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024032708-vagrancy-backlash-61dd@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
45bcc0346561 ("selftests: mptcp: diag: return KSFT_FAIL not test_cnt")
ce9902573652 ("selftests: mptcp: diag: format subtests results in TAP")
dc97251bf0b7 ("selftests: mptcp: diag: skip listen tests if not supported")
e04a30f78809 ("selftest: mptcp: add test for mptcp socket in use")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 45bcc0346561daa3f59e19a753cc7f3e08e8dff1 Mon Sep 17 00:00:00 2001
From: Geliang Tang <tanggeliang(a)kylinos.cn>
Date: Fri, 1 Mar 2024 18:11:22 +0100
Subject: [PATCH] selftests: mptcp: diag: return KSFT_FAIL not test_cnt
The test counter 'test_cnt' should not be returned in diag.sh, e.g. what
if only the 4th test fail? Will do 'exit 4' which is 'exit ${KSFT_SKIP}',
the whole test will be marked as skipped instead of 'failed'!
So we should do ret=${KSFT_FAIL} instead.
Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests")
Cc: stable(a)vger.kernel.org
Fixes: 42fb6cddec3b ("selftests: mptcp: more stable diag tests")
Signed-off-by: Geliang Tang <tanggeliang(a)kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
index f300f4e1eb59..18d37d4695c1 100755
--- a/tools/testing/selftests/net/mptcp/diag.sh
+++ b/tools/testing/selftests/net/mptcp/diag.sh
@@ -69,7 +69,7 @@ __chk_nr()
else
echo "[ fail ] expected $expected found $nr"
mptcp_lib_result_fail "${msg}"
- ret=$test_cnt
+ ret=${KSFT_FAIL}
fi
else
echo "[ ok ]"
@@ -124,11 +124,11 @@ wait_msk_nr()
if [ $i -ge $timeout ]; then
echo "[ fail ] timeout while expecting $expected max $max last $nr"
mptcp_lib_result_fail "${msg} # timeout"
- ret=$test_cnt
+ ret=${KSFT_FAIL}
elif [ $nr != $expected ]; then
echo "[ fail ] expected $expected found $nr"
mptcp_lib_result_fail "${msg} # unexpected result"
- ret=$test_cnt
+ ret=${KSFT_FAIL}
else
echo "[ ok ]"
mptcp_lib_result_pass "${msg}"
From: Kan Liang <kan.liang(a)linux.intel.com>
[The patch set is to fix the perf top failure on all Intel hybrid
machines. Without the patch, the default perf top command is broken.
I have verified that the patches on both stable 6.6 and 6.7. They can
be applied to stable 6.6 and 6.7 tree without any modification as well.
Please consider to apply them to stable 6.6 and 6.7. Thanks]
------------------
From: Kan Liang <kan.liang(a)linux.intel.com>
[ Upstream commit 5fa695e7da4975e8d21ce49f3718d6cf00ecb75e ]
perf top errors out on a hybrid machine
$perf top
Error:
The cycles:P event is not supported.
The perf top expects that the "cycles" is collected on all CPUs in the
system. But for hybrid there is no single "cycles" event which can cover
all CPUs. Perf has to split it into two cycles events, e.g.,
cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask.
If a event is opened on the unsupported CPU. The open fails. That's the
reason of the above error out.
Perf should only open the cycles event on the corresponding CPU. The
commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting
core PMU maps") intersect the requested CPU map with the CPU map of the
PMU. Use the evsel's cpus to replace user_requested_cpus.
The evlist's threads are also propagated to the evsel's threads in
__perf_evlist__propagate_maps(). For a system-wide event, perf appends
a dummy event and assign it to the evsel's threads. For a per-thread
event, the evlist's thread_map is assigned to the evsel's threads. The
same as the other tools, e.g., perf record, using the evsel's threads
when opening an event.
Reported-by: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Reviewed-by: Ian Rogers <irogers(a)google.com>
Signed-off-by: Kan Liang <kan.liang(a)linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Hector Martin <marcan(a)marcan.st>
Cc: Marc Zyngier <maz(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/
Link: https://lore.kernel.org/r/20231214144612.1092028-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/builtin-top.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ea8c7eca5eee..cce9350177e2 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top)
evlist__for_each_entry(evlist, counter) {
try_again:
- if (evsel__open(counter, top->evlist->core.user_requested_cpus,
- top->evlist->core.threads) < 0) {
+ if (evsel__open(counter, counter->core.cpus,
+ counter->core.threads) < 0) {
/*
* Specially handle overwrite fall back.
--
2.34.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
From 35f20786c481d5ced9283ff42de5c69b65e5ed13 Mon Sep 17 00:00:00 2001
From: Nathan Chancellor <nathan(a)kernel.org>
Date: Sat, 27 Jan 2024 11:07:43 -0700
Subject: [PATCH] powerpc: xor_vmx: Add '-mhard-float' to CFLAGS
arch/powerpc/lib/xor_vmx.o is built with '-msoft-float' (from the main
powerpc Makefile) and '-maltivec' (from its CFLAGS), which causes an
error when building with clang after a recent change in main:
error: option '-msoft-float' cannot be specified with '-maltivec'
make[6]: *** [scripts/Makefile.build:243: arch/powerpc/lib/xor_vmx.o] Error 1
Explicitly add '-mhard-float' before '-maltivec' in xor_vmx.o's CFLAGS
to override the previous inclusion of '-msoft-float' (as the last option
wins), which matches how other areas of the kernel use '-maltivec', such
as AMDGPU.
Cc: stable(a)vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/1986
Link: https://github.com/llvm/llvm-project/commit/4792f912b232141ecba4cbae538873b…
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/20240127-ppc-xor_vmx-drop-msoft-float-v1-1-f24140e81376@…
---
arch/powerpc/lib/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 6eac63e79a899..0ab65eeb93ee3 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -76,7 +76,7 @@ obj-$(CONFIG_PPC_LIB_RHEAP) += rheap.o
obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o
obj-$(CONFIG_ALTIVEC) += xor_vmx.o xor_vmx_glue.o
-CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec)
+CFLAGS_xor_vmx.o += -mhard-float -maltivec $(call cc-option,-mabi=altivec)
# Enable <altivec.h>
CFLAGS_xor_vmx.o += -isystem $(shell $(CC) -print-file-name=include)
--
2.43.0
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Thanks,
Sasha
------------------ original commit in Linus's tree ------------------
From 38b43539d64b2fa020b3b9a752a986769f87f7a6 Mon Sep 17 00:00:00 2001
From: Tony Battersby <tonyb(a)cybernetics.com>
Date: Thu, 29 Feb 2024 13:08:09 -0500
Subject: [PATCH] block: Fix page refcounts for unaligned buffers in
__bio_release_pages()
Fix an incorrect number of pages being released for buffers that do not
start at the beginning of a page.
Fixes: 1b151e2435fc ("block: Remove special-casing of compound pages")
Cc: stable(a)vger.kernel.org
Signed-off-by: Tony Battersby <tonyb(a)cybernetics.com>
Tested-by: Greg Edwards <gedwards(a)ddn.com>
Link: https://lore.kernel.org/r/86e592a9-98d4-4cff-a646-0c0084328356@cybernetics.…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
block/bio.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 496867b51609f..a8b6919400270 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1153,7 +1153,7 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
bio_for_each_folio_all(fi, bio) {
struct page *page;
- size_t done = 0;
+ size_t nr_pages;
if (mark_dirty) {
folio_lock(fi.folio);
@@ -1161,10 +1161,11 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
folio_unlock(fi.folio);
}
page = folio_page(fi.folio, fi.offset / PAGE_SIZE);
+ nr_pages = (fi.offset + fi.length - 1) / PAGE_SIZE -
+ fi.offset / PAGE_SIZE + 1;
do {
bio_release_page(bio, page++);
- done += PAGE_SIZE;
- } while (done < fi.length);
+ } while (--nr_pages != 0);
}
}
EXPORT_SYMBOL_GPL(__bio_release_pages);
--
2.43.0
From: Chengming Zhou <zhouchengming(a)bytedance.com>
commit e5c0ca13659e9d18f53368d651ed7e6e433ec1cf upstream.
Chuck reported [1] an IO hang problem on NFS exports that reside on SATA
devices and bisected to commit 615939a2ae73 ("blk-mq: defer to the normal
submission path for post-flush requests").
We analysed the IO hang problem, found there are two postflush requests
waiting for each other.
The first postflush request completed the REQ_FSEQ_DATA sequence, so go to
the REQ_FSEQ_POSTFLUSH sequence and added in the flush pending list, but
failed to blk_kick_flush() because of the second postflush request which
is inflight waiting in scheduler queue.
The second postflush waiting in scheduler queue can't be dispatched because
the first postflush hasn't released scheduler resource even though it has
completed by itself.
Fix it by releasing scheduler resource when the first postflush request
completed, so the second postflush can be dispatched and completed, then
make blk_kick_flush() succeed.
While at it, remove the check for e->ops.finish_request, as all
schedulers set that. Reaffirm this requirement by adding a WARN_ON_ONCE()
at scheduler registration time, just like we do for insert_requests and
dispatch_request.
[1] https://lore.kernel.org/all/7A57C7AE-A51A-4254-888B-FE15CA21F9E9@oracle.com/
Link: https://lore.kernel.org/linux-block/20230819031206.2744005-1-chengming.zhou…
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202308172100.8ce4b853-oliver.sang@intel.com
Fixes: 615939a2ae73 ("blk-mq: defer to the normal submission path for post-flush requests")
Reported-by: Chuck Lever <chuck.lever(a)oracle.com>
Signed-off-by: Chengming Zhou <zhouchengming(a)bytedance.com>
Tested-by: Chuck Lever <chuck.lever(a)oracle.com>
Link: https://lore.kernel.org/r/20230813152325.3017343-1-chengming.zhou@linux.dev
[axboe: folded in incremental fix and added tags]
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
[bvanassche: changed RQF_USE_SCHED into RQF_ELVPRIV; restored the
finish_request pointer check before calling finish_request and removed
the new warning from the elevator code. This patch fixes an I/O hang
when submitting a REQ_FUA request to a request queue for a zoned block
device for which FUA has been disabled (QUEUE_FLAG_FUA is not set).]
Signed-off-by: Bart Van Assche <bvanassche(a)acm.org>
---
block/blk-mq.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7ed6b9469f97..07610505c177 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -675,6 +675,22 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
}
EXPORT_SYMBOL_GPL(blk_mq_alloc_request_hctx);
+static void blk_mq_finish_request(struct request *rq)
+{
+ struct request_queue *q = rq->q;
+
+ if ((rq->rq_flags & RQF_ELVPRIV) &&
+ q->elevator->type->ops.finish_request) {
+ q->elevator->type->ops.finish_request(rq);
+ /*
+ * For postflush request that may need to be
+ * completed twice, we should clear this flag
+ * to avoid double finish_request() on the rq.
+ */
+ rq->rq_flags &= ~RQF_ELVPRIV;
+ }
+}
+
static void __blk_mq_free_request(struct request *rq)
{
struct request_queue *q = rq->q;
@@ -701,9 +717,7 @@ void blk_mq_free_request(struct request *rq)
{
struct request_queue *q = rq->q;
- if ((rq->rq_flags & RQF_ELVPRIV) &&
- q->elevator->type->ops.finish_request)
- q->elevator->type->ops.finish_request(rq);
+ blk_mq_finish_request(rq);
if (unlikely(laptop_mode && !blk_rq_is_passthrough(rq)))
laptop_io_completion(q->disk->bdi);
@@ -1025,6 +1039,8 @@ inline void __blk_mq_end_request(struct request *rq, blk_status_t error)
if (blk_mq_need_time_stamp(rq))
__blk_mq_end_request_acct(rq, ktime_get_ns());
+ blk_mq_finish_request(rq);
+
if (rq->end_io) {
rq_qos_done(rq->q, rq);
if (rq->end_io(rq, error) == RQ_END_IO_FREE)
@@ -1079,6 +1095,8 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob)
if (iob->need_ts)
__blk_mq_end_request_acct(rq, now);
+ blk_mq_finish_request(rq);
+
rq_qos_done(rq->q, rq);
/*
commit f45812cc23fb74bef62d4eb8a69fe7218f4b9f2a upstream.
Work around a quirk in a few old (2011-ish) UEFI implementations, where
a call to `GetNextVariableName` with a buffer size larger than 512 bytes
will always return EFI_INVALID_PARAMETER.
There is some lore around EFI variable names being up to 1024 bytes in
size, but this has no basis in the UEFI specification, and the upper
bounds are typically platform specific, and apply to the entire variable
(name plus payload).
Given that Linux does not permit creating files with names longer than
NAME_MAX (255) bytes, 512 bytes (== 256 UTF-16 characters) is a
reasonable limit.
Cc: <stable(a)vger.kernel.org> # 6.1+
Signed-off-by: Tim Schumacher <timschumi(a)gmx.de>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
[timschumi(a)gmx.de: adjusted diff for changed context and code move]
Signed-off-by: Tim Schumacher <timschumi(a)gmx.de>
---
Please apply this patch to stable kernel 5.15, 5.10, 5.4, and 4.19
respectively. Kernel 6.1 and upwards were already handled via CC,
5.15 and below required a separate patch due to a slight refactor of
surrounding code in bbc6d2c6ef22 ("efi: vars: Switch to new wrapper
layer") and a subsequent code move in 2d82e6227ea1 ("efi: vars: Move
efivar caching layer into efivarfs").
Please note that the upper Signed-off-by tags are remnants from the
original patch, I documented my modifications below them and added
another sign-off. As far as I was able to gather, this is the expected
format for diverged stable patches.
I'm not sure on the specifics of manual stable backports, so let me
know in case anything doesn't follow the process. The linux-efi team
and list are on CC both for documentation/review purposes and in case
a new sign-off/ack of theirs is required.
---
drivers/firmware/efi/vars.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c
index cae590bd08f2..eaed1ddcc803 100644
--- a/drivers/firmware/efi/vars.c
+++ b/drivers/firmware/efi/vars.c
@@ -415,7 +415,7 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
void *data, bool duplicates, struct list_head *head)
{
const struct efivar_operations *ops;
- unsigned long variable_name_size = 1024;
+ unsigned long variable_name_size = 512;
efi_char16_t *variable_name;
efi_status_t status;
efi_guid_t vendor_guid;
@@ -438,12 +438,13 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
}
/*
- * Per EFI spec, the maximum storage allocated for both
- * the variable name and variable data is 1024 bytes.
+ * A small set of old UEFI implementations reject sizes
+ * above a certain threshold, the lowest seen in the wild
+ * is 512.
*/
do {
- variable_name_size = 1024;
+ variable_name_size = 512;
status = ops->get_next_variable(&variable_name_size,
variable_name,
@@ -491,9 +492,13 @@ int efivar_init(int (*func)(efi_char16_t *, efi_guid_t, unsigned long, void *),
break;
case EFI_NOT_FOUND:
break;
+ case EFI_BUFFER_TOO_SMALL:
+ pr_warn("efivars: Variable name size exceeds maximum (%lu > 512)\n",
+ variable_name_size);
+ status = EFI_NOT_FOUND;
+ break;
default:
- printk(KERN_WARNING "efivars: get_next_variable: status=%lx\n",
- status);
+ pr_warn("efivars: get_next_variable: status=%lx\n", status);
status = EFI_NOT_FOUND;
break;
}
--
2.44.0
From: Yang Jihong <yangjihong1(a)huawei.com>
commit 6b959ba22d34ca793ffdb15b5715457c78e38b1a upstream.
perf_output_read_group may respond to IPI request of other cores and invoke
__perf_install_in_context function. As a result, hwc configuration is modified.
causing inconsistency and unexpected consequences.
Interrupts are not disabled when perf_output_read_group reads PMU counter.
In this case, IPI request may be received from other cores.
As a result, PMU configuration is modified and an error occurs when
reading PMU counter:
CPU0 CPU1
__se_sys_perf_event_open
perf_install_in_context
perf_output_read_group smp_call_function_single
for_each_sibling_event(sub, leader) { generic_exec_single
if ((sub != event) && remote_function
(sub->state == PERF_EVENT_STATE_ACTIVE)) |
<enter IPI handler: __perf_install_in_context> <----RAISE IPI-----+
__perf_install_in_context
ctx_resched
event_sched_out
armpmu_del
...
hwc->idx = -1; // event->hwc.idx is set to -1
...
<exit IPI>
sub->pmu->read(sub);
armpmu_read
armv8pmu_read_counter
armv8pmu_read_hw_counter
int idx = event->hw.idx; // idx = -1
u64 val = armv8pmu_read_evcntr(idx);
u32 counter = ARMV8_IDX_TO_COUNTER(idx); // invalid counter = 30
read_pmevcntrn(counter) // undefined instruction
Signed-off-by: Yang Jihong <yangjihong1(a)huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Link: https://lkml.kernel.org/r/20220902082918.179248-1-yangjihong1@huawei.com
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
---
This race may also lead to observed behavior like RCU stalls, hang tasks,
OOM. Likely due to list corruption or a similar root cause.
---
kernel/events/core.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4e5a73c7db12..e79cd0fd1d2b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7119,9 +7119,16 @@ static void perf_output_read_group(struct perf_output_handle *handle,
{
struct perf_event *leader = event->group_leader, *sub;
u64 read_format = event->attr.read_format;
+ unsigned long flags;
u64 values[6];
int n = 0;
+ /*
+ * Disabling interrupts avoids all counter scheduling
+ * (context switches, timer based rotation and IPIs).
+ */
+ local_irq_save(flags);
+
values[n++] = 1 + leader->nr_siblings;
if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
@@ -7157,6 +7164,8 @@ static void perf_output_read_group(struct perf_output_handle *handle,
__output_copy(handle, values, n * sizeof(u64));
}
+
+ local_irq_restore(flags);
}
#define PERF_FORMAT_TOTAL_TIMES (PERF_FORMAT_TOTAL_TIME_ENABLED|\
--
2.34.1