Hi Greg,
Our internal rpm build for aarch64 started failing after merging
v5.15.60. The build actually succeeds, but packaging failed due to tools
which extract the Build ID from vmlinux.
Expected:
readelf -n vmlinux
Displaying notes found in: .notes
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: ae5803ded69a15fb0a75905ec226a76cc1823d22
Linux 0x00000004 func
description data: 00 00 00 00
Linux 0x00000001 OPEN
description data: 00
Actual (post v5.15.60 merge):
readelf -n vmlinux
<no output>
Digging deeper...
Expected:
readelf --headers vmlinux | sed -ne '/Program Header/,$p'
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000010000 0xffff800008000000 0xffff800008000000
0x0000000002012a00 0x000000000208f724 RWE 0x10000
NOTE 0x0000000001705948 0xffff8000096f5948 0xffff8000096f5948
0x0000000000000054 0x0000000000000054 R 0x4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
Section to Segment mapping:
Segment Sections...
00 .head.text .text .got.plt .rodata .pci_fixup __ksymtab __ksymtab_gpl __kcrctab __kcrctab_gpl __ksymtab_strings __param __modver __ex_table .notes .hyp.rodata .init.text .exit.text .altinstructions .init.data .data..percpu .hyp.data..percpu .hyp.reloc .rela.dyn .data __bug_table .mmuoff.data.write .mmuoff.data.read .pecoff_edata_padding .bss
01 .notes
02
Actual:
readelf --headers vmlinux | sed -ne '/Program Header/,$p'
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000010000 0xffff800008000000 0xffff800008000000
0x0000000002012a00 0x000000000208f724 RWE 0x10000
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
Section to Segment mapping:
Segment Sections...
00 .head.text .text .got.plt .rodata .pci_fixup __ksymtab __ksymtab_gpl __kcrctab __kcrctab_gpl __ksymtab_strings __param __modver __ex_table .notes .hyp.rodata .init.text .exit.text .altinstructions .init.data .data..percpu .hyp.data..percpu .hyp.reloc .rela.dyn .data __bug_table .mmuoff.data.write .mmuoff.data.read .pecoff_edata_padding .bss
01
readelf -n fails since NOTE segment is missing from vmlinux.
Why?
5.15 Commit: 4c7ee827da2c ("Makefile: link with -z noexecstack --no-warn-rwx-segments")
surfaced this issue.
1. Reverting this patch fixed this issue.
2. Removing all .note.GNU-stack sections in cmd_modversions_S also worked:
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 17aa8ef2d52a..b95cfbb43cee 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -379,6 +379,8 @@ cmd_modversions_S = \
if $(OBJDUMP) -h $@ | grep -q __ksymtab; then \
$(call cmd_gensymtypes_S,$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \
> $(@D)/.tmp_$(@F:.o=.ver); \
+ echo "SECTIONS { /DISCARD/ : { *(.note.GNU-stack) }}" \
+ >> $(@D)/.tmp_$(@F:.o=.ver); \
\
$(LD) $(KBUILD_LDFLAGS) -r -o $(@D)/.tmp_$(@F) $@ \
-T $(@D)/.tmp_$(@F:.o=.ver); \
3. But the simplest fix I found was to remove -z noexecstack just for
head.S (this patch).
This issue exists, in my testing of arm64 builds, for stable versions
5.15.30+, 5.10.136+, and 5.4.210+
and reproduces with every attempted combination of gcc{11,12}
ld{2.36,2.37,2.38,2.39}.
It can also be reproduced upstream by cherry-picking 0d362be5b142 on-top until
7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS")
which was part of merge
df202b452fe6 ("Merge tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild")
It seems kernels prior to v5.4 and 5.19+ use differing KBUILD rules
(with CONFIG_MODVERSIONS=y) for linking components of vmlinux, and in my
testing these other versions did not encounter this issue.
Other architectures besides arm64 might also have similar bug/feature of ld.
Though, x86_64 v5.15.60 worked as expected, so no further experiments
were performed for x86_64.
arm64 repro test:
make ARCH=arm64 mrproper
cp arch/arm64/configs/defconfig ".config"
scripts/config -e MODVERSIONS
make ARCH=arm64 olddefconfig
make ARCH=arm64 -j16 vmlinux
readelf -n vmlinux | grep -F "Build ID:"
Please consider applying for 5.15, 5.10, and 5.4.
Regards,
--Tom
Tom Saeger (1):
arm64: fix Build ID if CONFIG_MODVERSIONS
arch/arm64/kernel/Makefile | 5 +++++
1 file changed, 5 insertions(+)
base-commit: e4a7232c917cd1b56d5b4fa9d7a23e3eabfecba0
--
2.38.1
Analogue to commit 8aa59e355949 ("can: af_can: fix NULL pointer
dereference in can_rx_register()") we need to check for a missing
initialization of ml_priv in the receive path of CAN frames.
Since commit 4e096a18867a ("net: introduce CAN specific pointer in the
struct net_device") the check for dev->type to be ARPHRD_CAN is not
sufficient anymore since bonding or tun netdevices claim to be CAN
devices but do not initialize ml_priv accordingly.
Upstream commit 0acc442309a0 ("can: af_can: fix NULL pointer
dereference in can_rcv_filter")
Fixes: 4e096a18867a ("net: introduce CAN specific pointer in the struct net_device")
Reported-by: syzbot+2d7f58292cb5b29eb5ad(a)syzkaller.appspotmail.com
Reported-by: Wei Chen <harperchen1110(a)gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan(a)hartkopp.net>
Cc: stable(a)vger.kernel.org # 5.12 .. 6.0
---
net/can/af_can.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1fb49d51b25d..4392f1d9c027 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -678,11 +678,11 @@ static void can_receive(struct sk_buff *skb, struct net_device *dev)
static int can_rcv(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt, struct net_device *orig_dev)
{
struct canfd_frame *cfd = (struct canfd_frame *)skb->data;
- if (unlikely(dev->type != ARPHRD_CAN || skb->len != CAN_MTU)) {
+ if (unlikely(dev->type != ARPHRD_CAN || !can_get_ml_priv(dev) || skb->len != CAN_MTU)) {
pr_warn_once("PF_CAN: dropped non conform CAN skbuff: dev type %d, len %d\n",
dev->type, skb->len);
goto free_skb;
}
@@ -704,11 +704,11 @@ static int can_rcv(struct sk_buff *skb, struct net_device *dev,
static int canfd_rcv(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt, struct net_device *orig_dev)
{
struct canfd_frame *cfd = (struct canfd_frame *)skb->data;
- if (unlikely(dev->type != ARPHRD_CAN || skb->len != CANFD_MTU)) {
+ if (unlikely(dev->type != ARPHRD_CAN || !can_get_ml_priv(dev) || skb->len != CANFD_MTU)) {
pr_warn_once("PF_CAN: dropped non conform CAN FD skbuff: dev type %d, len %d\n",
dev->type, skb->len);
goto free_skb;
}
--
2.30.2
The following commit has been merged into the timers/core branch of tip:
Commit-ID: 45ae272a948a03a7d55748bf52d2f47d3b4e1d5a
Gitweb: https://git.kernel.org/tip/45ae272a948a03a7d55748bf52d2f47d3b4e1d5a
Author: Joe Korty <joe.korty(a)concurrent-rt.com>
AuthorDate: Mon, 21 Nov 2022 14:53:43
Committer: Daniel Lezcano <daniel.lezcano(a)kernel.org>
CommitterDate: Fri, 02 Dec 2022 12:48:28 +01:00
clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error
The TVAL register is 32 bit signed. Thus only the lower 31 bits are
available to specify when an interrupt is to occur at some time in the
near future. Attempting to specify a larger interval with TVAL results
in a negative time delta which means the timer fires immediately upon
being programmed, rather than firing at that expected future time.
The solution is for Linux to declare that TVAL is a 31 bit register rather
than give its true size of 32 bits. This prevents Linux from programming
TVAL with a too-large value. Note that, prior to 5.16, this little trick
was the standard way to handle TVAL in Linux, so there is nothing new
happening here on that front.
The softlockup detector hides the issue, because it keeps generating
short timer deadlines that are within the scope of the broken timer.
Disable it, and you start using NO_HZ with much longer timer deadlines,
which turns into an interrupt flood:
11: 1124855130 949168462 758009394 76417474 104782230 30210281
310890 1734323687 GICv2 29 Level arch_timer
And "much longer" isn't that long: it takes less than 43s to underflow
TVAL at 50MHz (the frequency of the counter on XGene-1).
Some comments on the v1 version of this patch by Marc Zyngier:
XGene implements CVAL (a 64bit comparator) in terms of TVAL (a countdown
register) instead of the other way around. TVAL being a 32bit register,
the width of the counter should equally be 32. However, TVAL is a
*signed* value, and keeps counting down in the negative range once the
timer fires.
It means that any TVAL value with bit 31 set will fire immediately,
as it cannot be distinguished from an already expired timer. Reducing
the timer range back to a paltry 31 bits papers over the issue.
Another problem cannot be fixed though, which is that the timer interrupt
*must* be handled within the negative countdown period, or the interrupt
will be lost (TVAL will rollover to a positive value, indicative of a
new timer deadline).
Cc: stable(a)vger.kernel.org # 5.16+
Fixes: 012f18850452 ("clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations")
Signed-off-by: Joe Korty <joe.korty(a)concurrent-rt.com>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
[maz: revamped the commit message]
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Link: https://lore.kernel.org/r/20221024165422.GA51107@zipoli.concurrent-rt.com
Link: https://lore.kernel.org/r/20221121145343.896018-1-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)kernel.org>
---
drivers/clocksource/arm_arch_timer.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 9c3420a..e2920da 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -806,6 +806,9 @@ static u64 __arch_timer_check_delta(void)
/*
* XGene-1 implements CVAL in terms of TVAL, meaning
* that the maximum timer range is 32bit. Shame on them.
+ *
+ * Note that TVAL is signed, thus has only 31 of its
+ * 32 bits to express magnitude.
*/
MIDR_ALL_VERSIONS(MIDR_CPU_MODEL(ARM_CPU_IMP_APM,
APM_CPU_PART_POTENZA)),
@@ -813,8 +816,8 @@ static u64 __arch_timer_check_delta(void)
};
if (is_midr_in_range_list(read_cpuid_id(), broken_cval_midrs)) {
- pr_warn_once("Broken CNTx_CVAL_EL1, limiting width to 32bits");
- return CLOCKSOURCE_MASK(32);
+ pr_warn_once("Broken CNTx_CVAL_EL1, using 32 bit TVAL instead.\n");
+ return CLOCKSOURCE_MASK(31);
}
#endif
return CLOCKSOURCE_MASK(arch_counter_get_width());