From: Pavel Skripkin <paskripkin(a)gmail.com>
commit 6f68cd634856f8ca93bafd623ba5357e0f648c68 upstream.
Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was
in wrong error handling in batadv_mesh_init().
Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case
of any batadv_*_init() calls failure. This approach may work well, when
there is some kind of indicator, which can tell which parts of batadv are
initialized; but there isn't any.
All written above lead to cleaning up uninitialized fields. Even if we hide
ODEBUG warning by initializing bat_priv->nc.work, syzbot was able to hit
GPF in batadv_nc_purge_paths(), because hash pointer in still NULL. [1]
To fix these bugs we can unwind batadv_*_init() calls one by one.
It is good approach for 2 reasons: 1) It fixes bugs on error handling
path 2) It improves the performance, since we won't call unneeded
batadv_*_free() functions.
So, this patch makes all batadv_*_init() clean up all allocated memory
before returning with an error to no call correspoing batadv_*_free()
and open-codes batadv_mesh_free() with proper order to avoid touching
uninitialized fields.
Link: https://lore.kernel.org/netdev/000000000000c87fbd05cef6bcb0@google.com/ [1]
Reported-and-tested-by: syzbot+28b0702ada0bf7381f58(a)syzkaller.appspotmail.com
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol")
Signed-off-by: Pavel Skripkin <paskripkin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
[ bp: 4.4 backport: Drop batadv_v_mesh_{init,free} which are not there yet. ]
Signed-off-by: Sven Eckelmann <sven(a)narfation.org>
---
Submission according to the request in
https://lore.kernel.org/all/163559888490194@kroah.com/
net/batman-adv/bridge_loop_avoidance.c | 8 +++--
net/batman-adv/main.c | 44 +++++++++++++++++++-------
net/batman-adv/network-coding.c | 4 ++-
net/batman-adv/translation-table.c | 4 ++-
4 files changed, 44 insertions(+), 16 deletions(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c
index 1267cbb1a329..5e59a6ecae42 100644
--- a/net/batman-adv/bridge_loop_avoidance.c
+++ b/net/batman-adv/bridge_loop_avoidance.c
@@ -1346,10 +1346,14 @@ int batadv_bla_init(struct batadv_priv *bat_priv)
return 0;
bat_priv->bla.claim_hash = batadv_hash_new(128);
- bat_priv->bla.backbone_hash = batadv_hash_new(32);
+ if (!bat_priv->bla.claim_hash)
+ return -ENOMEM;
- if (!bat_priv->bla.claim_hash || !bat_priv->bla.backbone_hash)
+ bat_priv->bla.backbone_hash = batadv_hash_new(32);
+ if (!bat_priv->bla.backbone_hash) {
+ batadv_hash_destroy(bat_priv->bla.claim_hash);
return -ENOMEM;
+ }
batadv_hash_set_lock_class(bat_priv->bla.claim_hash,
&batadv_claim_hash_lock_class_key);
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index 88cea5154113..8ba7b86579d4 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -159,24 +159,34 @@ int batadv_mesh_init(struct net_device *soft_iface)
INIT_HLIST_HEAD(&bat_priv->softif_vlan_list);
ret = batadv_originator_init(bat_priv);
- if (ret < 0)
- goto err;
+ if (ret < 0) {
+ atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING);
+ goto err_orig;
+ }
ret = batadv_tt_init(bat_priv);
- if (ret < 0)
- goto err;
+ if (ret < 0) {
+ atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING);
+ goto err_tt;
+ }
ret = batadv_bla_init(bat_priv);
- if (ret < 0)
- goto err;
+ if (ret < 0) {
+ atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING);
+ goto err_bla;
+ }
ret = batadv_dat_init(bat_priv);
- if (ret < 0)
- goto err;
+ if (ret < 0) {
+ atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING);
+ goto err_dat;
+ }
ret = batadv_nc_mesh_init(bat_priv);
- if (ret < 0)
- goto err;
+ if (ret < 0) {
+ atomic_set(&bat_priv->mesh_state, BATADV_MESH_DEACTIVATING);
+ goto err_nc;
+ }
batadv_gw_init(bat_priv);
batadv_mcast_init(bat_priv);
@@ -186,8 +196,18 @@ int batadv_mesh_init(struct net_device *soft_iface)
return 0;
-err:
- batadv_mesh_free(soft_iface);
+err_nc:
+ batadv_dat_free(bat_priv);
+err_dat:
+ batadv_bla_free(bat_priv);
+err_bla:
+ batadv_tt_free(bat_priv);
+err_tt:
+ batadv_originator_free(bat_priv);
+err_orig:
+ batadv_purge_outstanding_packets(bat_priv, NULL);
+ atomic_set(&bat_priv->mesh_state, BATADV_MESH_INACTIVE);
+
return ret;
}
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 91de807a8f03..9317d872b9c0 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -159,8 +159,10 @@ int batadv_nc_mesh_init(struct batadv_priv *bat_priv)
&batadv_nc_coding_hash_lock_class_key);
bat_priv->nc.decoding_hash = batadv_hash_new(128);
- if (!bat_priv->nc.decoding_hash)
+ if (!bat_priv->nc.decoding_hash) {
+ batadv_hash_destroy(bat_priv->nc.coding_hash);
goto err;
+ }
batadv_hash_set_lock_class(bat_priv->nc.decoding_hash,
&batadv_nc_decoding_hash_lock_class_key);
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index 5f976485e8c6..1ad90267064d 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -3833,8 +3833,10 @@ int batadv_tt_init(struct batadv_priv *bat_priv)
return ret;
ret = batadv_tt_global_init(bat_priv);
- if (ret < 0)
+ if (ret < 0) {
+ batadv_tt_local_table_free(bat_priv);
return ret;
+ }
batadv_tvlv_handler_register(bat_priv, batadv_tt_tvlv_ogm_handler_v1,
batadv_tt_tvlv_unicast_handler_v1,
--
2.30.2
This patch is for linux-5.10.y only.
When scripts/lld-version.sh was initially written, it did not account
for the LLD_VENDOR cmake flag, which changes the output of ld.lld's
--version flag slightly.
Without LLD_VENDOR:
$ ld.lld --version
LLD 14.0.0 (compatible with GNU linkers)
With LLD_VENDOR:
$ ld.lld --version
Debian LLD 14.0.0 (compatible with GNU linkers)
As a result, CONFIG_LLD_VERSION is messed up and configuration values
that are dependent on it cannot be selected:
scripts/lld-version.sh: 20: printf: LLD: expected numeric value
scripts/lld-version.sh: 20: printf: LLD: expected numeric value
scripts/lld-version.sh: 20: printf: LLD: expected numeric value
init/Kconfig:52:warning: 'LLD_VERSION': number is invalid
.config:11:warning: symbol value '00000' invalid for LLD_VERSION
.config:8800:warning: override: CPU_BIG_ENDIAN changes choice state
This was fixed upstream by commit 1f09af062556 ("kbuild: Fix
ld-version.sh script if LLD was built with LLD_VENDOR") in 5.12 but that
was done to ld-version.sh after it was massively rewritten in
commit 02aff8592204 ("kbuild: check the minimum linker version in
Kconfig").
To avoid bringing in that change plus its prerequisites and fixes, just
modify lld-version.sh to make it similar to the upstream ld-version.sh,
which handles ld.lld with or without LLD_VENDOR and ld.bfd without any
errors.
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
Our CI caught this error with newer versions of Debian's ld.lld, which
added LLD_VENDOR it seems:
https://github.com/ClangBuiltLinux/continuous-integration2/runs/4206343929?…
A similar change was done by me for Android, where it has seen no
issues:
https://android-review.googlesource.com/c/kernel/common/+/1744324
I believe this is a safer change than backporting the fixes from
upstream but if you guys feel otherwise, I can do so. This is only
needed in 5.10 as CONFIG_LLD_VERSION does not exist in 5.4 and it was
fixed in 5.12 upstream.
scripts/lld-version.sh | 35 ++++++++++++++++++++++++++---------
1 file changed, 26 insertions(+), 9 deletions(-)
diff --git a/scripts/lld-version.sh b/scripts/lld-version.sh
index d70edb4d8a4f..f1eeee450a23 100755
--- a/scripts/lld-version.sh
+++ b/scripts/lld-version.sh
@@ -6,15 +6,32 @@
# Print the linker version of `ld.lld' in a 5 or 6-digit form
# such as `100001' for ld.lld 10.0.1 etc.
-linker_string="$($* --version)"
+set -e
-if ! ( echo $linker_string | grep -q LLD ); then
+# Convert the version string x.y.z to a canonical 5 or 6-digit form.
+get_canonical_version()
+{
+ IFS=.
+ set -- $1
+
+ # If the 2nd or 3rd field is missing, fill it with a zero.
+ echo $((10000 * $1 + 100 * ${2:-0} + ${3:-0}))
+}
+
+# Get the first line of the --version output.
+IFS='
+'
+set -- $(LC_ALL=C "$@" --version)
+
+# Split the line on spaces.
+IFS=' '
+set -- $1
+
+while [ $# -gt 1 -a "$1" != "LLD" ]; do
+ shift
+done
+if [ "$1" = LLD ]; then
+ echo $(get_canonical_version ${2%-*})
+else
echo 0
- exit 1
fi
-
-VERSION=$(echo $linker_string | cut -d ' ' -f 2)
-MAJOR=$(echo $VERSION | cut -d . -f 1)
-MINOR=$(echo $VERSION | cut -d . -f 2)
-PATCHLEVEL=$(echo $VERSION | cut -d . -f 3)
-printf "%d%02d%02d\\n" $MAJOR $MINOR $PATCHLEVEL
base-commit: bd816c278316f20a5575debc64dde4422229a880
--
2.34.0.rc0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 86432a6dca9bed79111990851df5756d3eb5f57c Mon Sep 17 00:00:00 2001
From: Gao Xiang <hsiangkao(a)linux.alibaba.com>
Date: Thu, 4 Nov 2021 02:20:06 +0800
Subject: [PATCH] erofs: fix unsafe pagevec reuse of hooked pclusters
There are pclusters in runtime marked with Z_EROFS_PCLUSTER_TAIL
before actual I/O submission. Thus, the decompression chain can be
extended if the following pcluster chain hooks such tail pcluster.
As the related comment mentioned, if some page is made of a hooked
pcluster and another followed pcluster, it can be reused for in-place
I/O (since I/O should be submitted anyway):
_______________________________________________________________
| tail (partial) page | head (partial) page |
|_____PRIMARY_HOOKED___|____________PRIMARY_FOLLOWED____________|
However, it's by no means safe to reuse as pagevec since if such
PRIMARY_HOOKED pclusters finally move into bypass chain without I/O
submission. It's somewhat hard to reproduce with LZ4 and I just found
it (general protection fault) by ro_fsstressing a LZMA image for long
time.
I'm going to actively clean up related code together with multi-page
folio adaption in the next few months. Let's address it directly for
easier backporting for now.
Call trace for reference:
z_erofs_decompress_pcluster+0x10a/0x8a0 [erofs]
z_erofs_decompress_queue.isra.36+0x3c/0x60 [erofs]
z_erofs_runqueue+0x5f3/0x840 [erofs]
z_erofs_readahead+0x1e8/0x320 [erofs]
read_pages+0x91/0x270
page_cache_ra_unbounded+0x18b/0x240
filemap_get_pages+0x10a/0x5f0
filemap_read+0xa9/0x330
new_sync_read+0x11b/0x1a0
vfs_read+0xf1/0x190
Link: https://lore.kernel.org/r/20211103182006.4040-1-xiang@kernel.org
Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
Cc: <stable(a)vger.kernel.org> # 4.19+
Reviewed-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Gao Xiang <hsiangkao(a)linux.alibaba.com>
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 11c7a1aaebad..eb51df4a9f77 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -373,8 +373,8 @@ static bool z_erofs_try_inplace_io(struct z_erofs_collector *clt,
/* callers must be with collection lock held */
static int z_erofs_attach_page(struct z_erofs_collector *clt,
- struct page *page,
- enum z_erofs_page_type type)
+ struct page *page, enum z_erofs_page_type type,
+ bool pvec_safereuse)
{
int ret;
@@ -384,9 +384,9 @@ static int z_erofs_attach_page(struct z_erofs_collector *clt,
z_erofs_try_inplace_io(clt, page))
return 0;
- ret = z_erofs_pagevec_enqueue(&clt->vector, page, type);
+ ret = z_erofs_pagevec_enqueue(&clt->vector, page, type,
+ pvec_safereuse);
clt->cl->vcnt += (unsigned int)ret;
-
return ret ? 0 : -EAGAIN;
}
@@ -729,7 +729,8 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
tight &= (clt->mode >= COLLECT_PRIMARY_FOLLOWED);
retry:
- err = z_erofs_attach_page(clt, page, page_type);
+ err = z_erofs_attach_page(clt, page, page_type,
+ clt->mode >= COLLECT_PRIMARY_FOLLOWED);
/* should allocate an additional short-lived page for pagevec */
if (err == -EAGAIN) {
struct page *const newpage =
@@ -737,7 +738,7 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
set_page_private(newpage, Z_EROFS_SHORTLIVED_PAGE);
err = z_erofs_attach_page(clt, newpage,
- Z_EROFS_PAGE_TYPE_EXCLUSIVE);
+ Z_EROFS_PAGE_TYPE_EXCLUSIVE, true);
if (!err)
goto retry;
}
diff --git a/fs/erofs/zpvec.h b/fs/erofs/zpvec.h
index dfd7fe0503bb..b05464f4a808 100644
--- a/fs/erofs/zpvec.h
+++ b/fs/erofs/zpvec.h
@@ -106,11 +106,18 @@ static inline void z_erofs_pagevec_ctor_init(struct z_erofs_pagevec_ctor *ctor,
static inline bool z_erofs_pagevec_enqueue(struct z_erofs_pagevec_ctor *ctor,
struct page *page,
- enum z_erofs_page_type type)
+ enum z_erofs_page_type type,
+ bool pvec_safereuse)
{
- if (!ctor->next && type)
- if (ctor->index + 1 == ctor->nr)
+ if (!ctor->next) {
+ /* some pages cannot be reused as pvec safely without I/O */
+ if (type == Z_EROFS_PAGE_TYPE_EXCLUSIVE && !pvec_safereuse)
+ type = Z_EROFS_VLE_PAGE_TYPE_TAIL_SHARED;
+
+ if (type != Z_EROFS_PAGE_TYPE_EXCLUSIVE &&
+ ctor->index + 1 == ctor->nr)
return false;
+ }
if (ctor->index >= ctor->nr)
z_erofs_pagevec_ctor_pagedown(ctor, false);
Hi, Thomas,
On Fri, 5 Nov 2021 at 13:14, Thomas Gleixner <tglx(a)linutronix.de> wrote:
>
> On Thu, Nov 04 2021 at 18:01, Marc Zyngier wrote:
> > Rui reported[1] that his Nvidia ION system stopped working with 5.15,
> > with the AHCI device failing to get any MSI. A rapid investigation
> > revealed that although the device doesn't advertise MSI masking, it
> > actually needs it. Quality hardware indeed.
> >
> > Anyway, the couple of patches below are an attempt at dealing with the
> > issue in a more or less generic way.
> >
> > [1] https://lore.kernel.org/r/CALjTZvbzYfBuLB+H=fj2J+9=DxjQ2Uqcy0if_PvmJ-nU-qEg…
> >
> > Marc Zyngier (2):
> > PCI: MSI: Deal with devices lying about their MSI mask capability
> > PCI: Add MSI masking quirk for Nvidia ION AHCI
> >
> > drivers/pci/msi.c | 3 +++
> > drivers/pci/quirks.c | 6 ++++++
> > include/linux/pci.h | 2 ++
> > 3 files changed, 11 insertions(+)
>
> Groan.
>
> Reviewed-by: Thomas Gleixner <tglx(a)linutronix.de>
Just a reminder, to make sure this doesn't fall through the cracks.
It's already in 5.16, but needs to be backported to 5.15. I'm not
seeing it in Greg's 5.15 stable queue yet.
Thanks,
Rui
The patch below does not apply to the 5.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3735459037114d31e5acd9894fad9aed104231a0 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Tue, 9 Nov 2021 14:53:57 +0100
Subject: [PATCH] PCI/MSI: Destroy sysfs before freeing entries
free_msi_irqs() frees the MSI entries before destroying the sysfs entries
which are exposing them. Nothing prevents a concurrent free while a sysfs
file is read and accesses the possibly freed entry.
Move the sysfs release ahead of freeing the entries.
Fixes: 1c51b50c2995 ("PCI/MSI: Export MSI mode using attributes, not kobjects")
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Bjorn Helgaas <helgaas(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/87sfw5305m.ffs@tglx
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 70433013897b..48e3f4e47b29 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -368,6 +368,11 @@ static void free_msi_irqs(struct pci_dev *dev)
for (i = 0; i < entry->nvec_used; i++)
BUG_ON(irq_has_action(entry->irq + i));
+ if (dev->msi_irq_groups) {
+ msi_destroy_sysfs(&dev->dev, dev->msi_irq_groups);
+ dev->msi_irq_groups = NULL;
+ }
+
pci_msi_teardown_msi_irqs(dev);
list_for_each_entry_safe(entry, tmp, msi_list, list) {
@@ -379,11 +384,6 @@ static void free_msi_irqs(struct pci_dev *dev)
list_del(&entry->list);
free_msi_entry(entry);
}
-
- if (dev->msi_irq_groups) {
- msi_destroy_sysfs(&dev->dev, dev->msi_irq_groups);
- dev->msi_irq_groups = NULL;
- }
}
static void pci_intx_for_msi(struct pci_dev *dev, int enable)
Please apply this patch to the stable kernels up to v5.15.
It's basically upstream commit 3ec18fc7831e7d79e2d536dd1f3bc0d3ba425e8a,
adjusted so that it applies to the stable kernels.
It requires that upstream commit 8779e05ba8aaffec1829872ef9774a71f44f6580
is applied before, which shouldn't be a problem as it was tagged for
stable series in the original commmit already.
Thanks,
Helge
--------
From: Sven Schnelle <svens(a)stackframe.org>
Date: Sat, 13 Nov 2021 20:41:17 +0100
Subject: [PATCH] parisc/entry: fix trace test in syscall exit path
Upstream commit: 3ec18fc7831e7d79e2d536dd1f3bc0d3ba425e8a
commit 8779e05ba8aa ("parisc: Fix ptrace check on syscall return")
fixed testing of TI_FLAGS. This uncovered a bug in the test mask.
syscall_restore_rfi is only used when the kernel needs to exit to
usespace with single or block stepping and the recovery counter
enabled. The test however used _TIF_SYSCALL_TRACE_MASK, which
includes a lot of bits that shouldn't be tested here.
Fix this by using TIF_SINGLESTEP and TIF_BLOCKSTEP directly.
I encountered this bug by enabling syscall tracepoints. Both in qemu and
on real hardware. As soon as i enabled the tracepoint (sys_exit_read,
but i guess it doesn't really matter which one), i got random page
faults in userspace almost immediately.
Signed-off-by: Sven Schnelle <svens(a)stackframe.org>
Signed-off-by: Helge Deller <deller(a)gmx.de>
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 2716e58b498b..437c8d31f390 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -1835,7 +1835,7 @@ syscall_restore:
/* Are we being ptraced? */
LDREG TI_FLAGS-THREAD_SZ_ALGN-FRAME_SIZE(%r30),%r19
- ldi _TIF_SYSCALL_TRACE_MASK,%r2
+ ldi _TIF_SINGLESTEP|_TIF_BLOCKSTEP,%r2
and,COND(=) %r19,%r2,%r0
b,n syscall_restore_rfi
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4030a6e6a6a4a42ff8c18414c9e0c93e24cc70b8 Mon Sep 17 00:00:00 2001
From: Paul Burton <paulburton(a)google.com>
Date: Thu, 1 Jul 2021 10:24:07 -0700
Subject: [PATCH] tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
Currently tgid_map is sized at PID_MAX_DEFAULT entries, which means that
on systems where pid_max is configured higher than PID_MAX_DEFAULT the
ftrace record-tgid option doesn't work so well. Any tasks with PIDs
higher than PID_MAX_DEFAULT are simply not recorded in tgid_map, and
don't show up in the saved_tgids file.
In particular since systemd v243 & above configure pid_max to its
highest possible 1<<22 value by default on 64 bit systems this renders
the record-tgids option of little use.
Increase the size of tgid_map to the configured pid_max instead,
allowing it to cover the full range of PIDs up to the maximum value of
PID_MAX_LIMIT if the system is configured that way.
On 64 bit systems with pid_max == PID_MAX_LIMIT this will increase the
size of tgid_map from 256KiB to 16MiB. Whilst this 64x increase in
memory overhead sounds significant 64 bit systems are presumably best
placed to accommodate it, and since tgid_map is only allocated when the
record-tgid option is actually used presumably the user would rather it
spends sufficient memory to actually record the tgids they expect.
The size of tgid_map could also increase for CONFIG_BASE_SMALL=y
configurations, but these seem unlikely to be systems upon which people
are both configuring a large pid_max and running ftrace with record-tgid
anyway.
Of note is that we only allocate tgid_map once, the first time that the
record-tgid option is enabled. Therefore its size is only set once, to
the value of pid_max at the time the record-tgid option is first
enabled. If a user increases pid_max after that point, the saved_tgids
file will not contain entries for any tasks with pids beyond the earlier
value of pid_max.
Link: https://lkml.kernel.org/r/20210701172407.889626-2-paulburton@google.com
Fixes: d914ba37d714 ("tracing: Add support for recording tgid of tasks")
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Joel Fernandes <joelaf(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Paul Burton <paulburton(a)google.com>
[ Fixed comment coding style ]
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4843076d67d3..14f56e9fa001 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2191,8 +2191,15 @@ void tracing_reset_all_online_cpus(void)
}
}
+/*
+ * The tgid_map array maps from pid to tgid; i.e. the value stored at index i
+ * is the tgid last observed corresponding to pid=i.
+ */
static int *tgid_map;
+/* The maximum valid index into tgid_map. */
+static size_t tgid_map_max;
+
#define SAVED_CMDLINES_DEFAULT 128
#define NO_CMDLINE_MAP UINT_MAX
static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED;
@@ -2468,24 +2475,41 @@ void trace_find_cmdline(int pid, char comm[])
preempt_enable();
}
+static int *trace_find_tgid_ptr(int pid)
+{
+ /*
+ * Pairs with the smp_store_release in set_tracer_flag() to ensure that
+ * if we observe a non-NULL tgid_map then we also observe the correct
+ * tgid_map_max.
+ */
+ int *map = smp_load_acquire(&tgid_map);
+
+ if (unlikely(!map || pid > tgid_map_max))
+ return NULL;
+
+ return &map[pid];
+}
+
int trace_find_tgid(int pid)
{
- if (unlikely(!tgid_map || !pid || pid > PID_MAX_DEFAULT))
- return 0;
+ int *ptr = trace_find_tgid_ptr(pid);
- return tgid_map[pid];
+ return ptr ? *ptr : 0;
}
static int trace_save_tgid(struct task_struct *tsk)
{
+ int *ptr;
+
/* treat recording of idle task as a success */
if (!tsk->pid)
return 1;
- if (unlikely(!tgid_map || tsk->pid > PID_MAX_DEFAULT))
+ ptr = trace_find_tgid_ptr(tsk->pid);
+ if (!ptr)
return 0;
- tgid_map[tsk->pid] = tsk->tgid;
+ *ptr = tsk->tgid;
return 1;
}
@@ -5225,6 +5249,8 @@ int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set)
int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
{
+ int *map;
+
if ((mask == TRACE_ITER_RECORD_TGID) ||
(mask == TRACE_ITER_RECORD_CMD))
lockdep_assert_held(&event_mutex);
@@ -5247,10 +5273,19 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
trace_event_enable_cmd_record(enabled);
if (mask == TRACE_ITER_RECORD_TGID) {
- if (!tgid_map)
- tgid_map = kvcalloc(PID_MAX_DEFAULT + 1,
- sizeof(*tgid_map),
- GFP_KERNEL);
+ if (!tgid_map) {
+ tgid_map_max = pid_max;
+ map = kvcalloc(tgid_map_max + 1, sizeof(*tgid_map),
+ GFP_KERNEL);
+
+ /*
+ * Pairs with smp_load_acquire() in
+ * trace_find_tgid_ptr() to ensure that if it observes
+ * the tgid_map we just allocated then it also observes
+ * the corresponding tgid_map_max value.
+ */
+ smp_store_release(&tgid_map, map);
+ }
if (!tgid_map) {
tr->trace_flags &= ~TRACE_ITER_RECORD_TGID;
return -ENOMEM;
@@ -5664,18 +5699,14 @@ static void *saved_tgids_next(struct seq_file *m, void *v, loff_t *pos)
{
int pid = ++(*pos);
- if (pid > PID_MAX_DEFAULT)
- return NULL;
-
- return &tgid_map[pid];
+ return trace_find_tgid_ptr(pid);
}
static void *saved_tgids_start(struct seq_file *m, loff_t *pos)
{
- if (!tgid_map || *pos > PID_MAX_DEFAULT)
- return NULL;
+ int pid = *pos;
- return &tgid_map[*pos];
+ return trace_find_tgid_ptr(pid);
}
static void saved_tgids_stop(struct seq_file *m, void *v)
From: Meng Li <meng.li(a)windriver.com>
In stable kernel v5.10, when run below command to remove ethernet driver on
stratix10 platform, there will be warning trace as below:
$ cd /sys/class/net/eth0/device/driver/
$ echo ff800000.ethernet > unbind
WARNING: CPU: 3 PID: 386 at drivers/clk/clk.c:810 clk_core_unprepare+0x114/0x274
Modules linked in: sch_fq_codel
CPU: 3 PID: 386 Comm: sh Tainted: G W 5.10.74-yocto-standard #1
Hardware name: SoCFPGA Stratix 10 SoCDK (DT)
pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
pc : clk_core_unprepare+0x114/0x274
lr : clk_core_unprepare+0x114/0x274
sp : ffff800011bdbb10
clk_core_unprepare+0x114/0x274
clk_unprepare+0x38/0x50
stmmac_remove_config_dt+0x40/0x80
stmmac_pltfr_remove+0x64/0x80
platform_drv_remove+0x38/0x60
... ..
el0_sync_handler+0x1a4/0x1b0
el0_sync+0x180/0x1c0
This issue is introduced by introducing upstream commit 8f269102baf7
("net: stmmac: disable clocks in stmmac_remove_config_dt()")
But in latest mainline kernel, there is no this issue. Because commit
5ec55823438e("net: stmmac: add clocks management for gmac driver") and its
folowing fixing commits improved clocks management for stmmac driver.
Therefore, backport them to stable kernel v5.10.
Joakim Zhang (2):
net: stmmac: add clocks management for gmac driver
net: stmmac: fix system hang if change mac address after interface
ifdown
Michael Riesch (1):
net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings
Wei Yongjun (1):
net: stmmac: platform: fix build error with !CONFIG_PM_SLEEP
Wong Vee Khee (1):
net: stmmac: fix issue where clk is being unprepared twice
Yang Yingliang (1):
net: stmmac: fix missing unlock on error in stmmac_suspend()
.../net/ethernet/stmicro/stmmac/dwmac-rk.c | 9 --
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 1 +
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 87 ++++++++++++--
.../net/ethernet/stmicro/stmmac/stmmac_mdio.c | 111 ++++++++++++++----
.../ethernet/stmicro/stmmac/stmmac_platform.c | 30 ++++-
5 files changed, 187 insertions(+), 51 deletions(-)
--
2.17.1
The patch below does not apply to the 5.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 1ae43851b18afe861120ebd7c426dc44f06bb2bd Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Thu, 16 Sep 2021 15:23:12 +0900
Subject: [PATCH] bootconfig: init: Fix memblock leak in xbc_make_cmdline()
Free unused memblock in a error case to fix memblock leak
in xbc_make_cmdline().
Link: https://lkml.kernel.org/r/163177339181.682366.8713781325929549256.stgit@dev…
Fixes: 51887d03aca1 ("bootconfig: init: Allow admin to use bootconfig for kernel command line")
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/init/main.c b/init/main.c
index 81a79a77db46..3c4054a95545 100644
--- a/init/main.c
+++ b/init/main.c
@@ -382,6 +382,7 @@ static char * __init xbc_make_cmdline(const char *key)
ret = xbc_snprint_cmdline(new_cmdline, len + 1, root);
if (ret < 0 || ret > len) {
pr_err("Failed to print extra kernel cmdline.\n");
+ memblock_free_ptr(new_cmdline, len + 1);
return NULL;
}