From: Geliang Tang tanggeliang@kylinos.cn
v2:
 - add patch 2, a new fix for sk_msg_memcopy_from_iter.
 - update patch 3, only test "sk->sk_prot->close" as Eric suggested.
 - update patch 4, use "goto err" instead of "return" as Eduard suggested.
 - add "Fixes" tags for patches 1-3.
 - change subject prefixes to "bpf-next" to trigger BPF CI.
 - cc Loongarch maintainers too.
BPF selftests do not seem to have been fully tested on Loongarch. When I ran these tests on Loongarch recently, some errors occurred. This patch set contains null-check fixes for these errors.
Geliang Tang (4):
  skmsg: null check for sg_page in sk_msg_recvmsg
  skmsg: null check for sg_page in sk_msg_memcopy_from_iter
  inet: null check for close in inet_release
  selftests/bpf: Null checks for link in bpf_tcp_ca

 net/core/skmsg.c                                 |  4 ++++
 net/ipv4/af_inet.c                               |  3 ++-
 .../selftests/bpf/prog_tests/bpf_tcp_ca.c        | 16 ++++++++++++----
 3 files changed, 18 insertions(+), 5 deletions(-)
From: Geliang Tang tanggeliang@kylinos.cn
Run the following BPF selftests on Loongarch:
./test_progs -t sockmap_basic
A Kernel panic occurs:
'''
Oops[#1]:
CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018-V4.0.11
pc 9000000004162774 ra 90000000048bf6c0 tp 90001000aa16c000 sp 90001000aa16fb90
a0 0000000000000000 a1 0000000000000000 a2 0000000000000000 a3 90001000aa16fd70
a4 0000000000000800 a5 0000000000000000 a6 000055557b63aae8 a7 00000000000000cf
t0 0000000000000000 t1 0000000000004000 t2 0000000000000048 t3 0000000000000000
t4 0000000000000001 t5 0000000000000002 t6 0000000000000001 t7 0000000000000002
t8 0000000000000018 u0 9000000004856150 s9 0000000000000000 s0 0000000000000000
s1 0000000000000000 s2 90001000aa16fd70 s3 0000000000000000 s4 0000000000000000
s5 0000000000004000 s6 900010009284dc00 s7 0000000000000001 s8 900010009284dc00
   ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
  ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
 PRMD: 0000000c (PPLV0 +PIE +PWE)
 EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
 ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
 BADV: 0000000000000040
 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=000000001cba0874)
Stack : 0000000000000001 fffffffffffffffc 0000000000000000 0000000000000000
        0000000000000018 0000000000000000 0000000000000000 90000000048bf6c0
        90000000052cd638 90001000aa16fd70 900010008bf51580 900010009284f000
        90000000049f2b90 900010009284f188 900010009284f178 90001000861d4780
        9000100084dccd00 0000000000000800 0000000000000007 fffffffffffffff2
        000000000453e92f 90000000049aae34 90001000aa16fd60 900010009284f000
        0000000000000000 0000000000000000 900010008bf51580 90000000049f2b90
        0000000000000001 0000000000000000 9000100084dc3a10 900010009284f1ac
        90001000aa16fd40 0000555559953278 0000000000000001 0000000000000000
        90001000aa16fdc8 9000000005a5a000 90001000861d4780 0000000000000800
        ...
Call Trace:
[<9000000004162774>] copy_page_to_iter+0x74/0x1c0
[<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
[<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
[<90000000049aae34>] inet_recvmsg+0x54/0x100
[<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
[<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
[<900000000481e27c>] sys_recvfrom+0x1c/0x40
[<9000000004c076ec>] do_syscall+0x8c/0xc0
[<9000000003731da4>] handle_syscall+0xc4/0x160

Code: 0010b09b 440125a0 0011df8d <28c10364> 0012b70c 00133305 0013b1ac 0010dc84 00151585

---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3510000
 .text @ 0x9000000003710000
 .data @ 0x9000000004d70000
 .bss  @ 0x9000000006469400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''
This is because "sg_page(sge)" is NULL in that case. This patch adds a null check for it in sk_msg_recvmsg() to fix this error.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Geliang Tang tanggeliang@kylinos.cn
---
 net/core/skmsg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..bafcc1e2eadf 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -432,6 +432,8 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
 			sge = sk_msg_elem(msg_rx, i);
 			copy = sge->length;
 			page = sg_page(sge);
+			if (!page)
+				goto out;
 			if (copied + copy > len)
 				copy = len - copied;
 			copy = copy_page_to_iter(page, sge->offset, copy, iter);
On Tue, Jun 25, 2024 at 10:25 AM Geliang Tang geliang@kernel.org wrote:
> [...]
> This is because "sg_page(sge)" is NULL in that case. This patch adds null
> check for it in sk_msg_recvmsg() to fix this error.
> [...]
> 			page = sg_page(sge);
> +			if (!page)
> +				goto out;
> [...]
This looks pretty much random to me.
Please find the root cause, instead of desperately trying to fix 'tests'.
Eric Dumazet wrote:
> On Tue, Jun 25, 2024 at 10:25 AM Geliang Tang geliang@kernel.org wrote:
> > [...]
> > 			page = sg_page(sge);
> > +			if (!page)
> > +				goto out;
> > [...]
>
> This looks pretty much random to me.
>
> Please find the root cause, instead of desperately trying to fix 'tests'.
If this happens, then either we put a bad msg_rx on the queue (see a few lines up) and we need to sort out why that msg_rx was built, or we walked off the end of a scatter-gather list and need to see why this test isn't sufficient:

	} while ((i != msg_rx->sg.end) && !sg_is_last(sge))

Is this happening every time you run the command, or did you run this for a long iteration and eventually hit this? I don't see why this would be specific to your arch though.
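The two possibilities raised above can be sketched in user space. This is only an illustrative model, not the kernel code: model_elem, model_ring, MODEL_SLOTS and model_first_null_page() are hypothetical names, and the ring merely imitates the shape of the do/while walk quoted above. The point it tries to show is that the loop body runs before the termination test, so a message with start == end still has its first slot examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SLOTS 4	/* hypothetical ring size, chosen for the sketch */

/* Hypothetical stand-in for one scatter-gather element. */
struct model_elem {
	const void *page;	/* NULL means no page was ever attached */
	bool last;		/* imitates the "last element" marker */
};

/* Hypothetical stand-in for the element ring of one message. */
struct model_ring {
	struct model_elem elem[MODEL_SLOTS];
	unsigned int start, end;
};

/*
 * Walk the ring with a do/while shaped like the quoted loop and return
 * the index of the first slot that has no page, or -1 if every visited
 * slot has one. The step budget only keeps the model from spinning.
 */
static int model_first_null_page(const struct model_ring *r)
{
	const struct model_elem *e;
	unsigned int i = r->start;
	unsigned int steps = 0;

	assert(r->start < MODEL_SLOTS);
	do {
		e = &r->elem[i];
		if (!e->page)
			return (int)i;
		i = (i + 1) % MODEL_SLOTS;
	} while (i != r->end && !e->last && ++steps < MODEL_SLOTS);

	return -1;
}
```

In this model an empty ring reports slot 0 (a bad message was queued), while a ring whose end index points past its populated slots reports the first unpopulated one (the walk ran off the end).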
On Tue, 2024-06-25 at 12:37 -0700, John Fastabend wrote:
> Eric Dumazet wrote:
> > On Tue, Jun 25, 2024 at 10:25 AM Geliang Tang geliang@kernel.org wrote:
> > > [...]
> > > 			page = sg_page(sge);
> > > +			if (!page)
> > > +				goto out;
> > > [...]
> >
> > This looks pretty much random to me.
> >
> > Please find the root cause, instead of desperately trying to fix 'tests'.
>
> If this happens then either we put a bad msg_rx on the queue see a few lines
> up and we need to sort out why that msg_rx was built. Or we walked off the
> end of a scatter gather list and need to see why this test isn't sufficient?
>
> } while ((i != msg_rx->sg.end) && !sg_is_last(sge))
>
> is this happening every time you run the command or did you run this for a
> long iteration and eventually hit this? I don't see why this would be

This happens every time the test_sockmap_skb_verdict_shutdown test in sockmap_basic is run. It hits this null page case on the x86_64 platform too.

> specific to your arch though.

The kernel panics when a null page is passed to kmap_local_page() on Loongarch only, and this function is arch-specific. I think this issue is somehow related to Loongarch's memory management.
Thanks, -Geliang
Hi John,
On Wed, 2024-06-26 at 21:05 +0800, Geliang Tang wrote:
> On Tue, 2024-06-25 at 12:37 -0700, John Fastabend wrote:
> > Eric Dumazet wrote:
> > > [...]
> > > This looks pretty much random to me.
> > >
> > > Please find the root cause, instead of desperately trying to fix
> > > 'tests'.
> >
> > If this happens then either we put a bad msg_rx on the queue see a few
> > lines up and we need to sort out why that msg_rx was built. Or we walked
I think I have figured out the issue. It is caused by an empty skb (skb->len == 0) being put on the queue.

In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero and no page is set for this sge (see sg_set_page()), but the empty sge is still queued on the ingress_msg list.

Then, in sk_msg_recvmsg(), this empty sge is dequeued and sg_page(sge) returns a NULL page. Passing this NULL page to copy_page_to_iter() makes the kernel panic.

To solve this, I think we should prevent empty skbs from being put on the queue. My new modification is as follows:

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..44952cdd1425 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1184,7 +1184,7 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
 
 	rcu_read_lock();
 	psock = sk_psock(sk);
-	if (unlikely(!psock)) {
+	if (unlikely(!psock || !len)) {
 		len = 0;
 		tcp_eat_skb(sk, skb);
 		sock_drop(sk, skb);
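The chain of events described in this message can be modelled with a small user-space sketch. Everything here is hypothetical (model_sge, model_msg and the model_* helpers are made-up names, and the real kernel structures are far richer); it only assumes, as the analysis above states, that a zero-length payload leaves the element without a page. The drop_empty flag plays the role of the proposed "!len" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scatter-gather element. */
struct model_sge {
	const char *page;	/* NULL when no data was ever attached */
	size_t length;
};

/* Hypothetical stand-in for a message on the ingress queue. */
struct model_msg {
	struct model_sge sge;
	bool queued;
};

/* Build a message; a zero-length payload attaches no page (assumption). */
static void model_build(struct model_msg *m, const char *data, size_t len)
{
	assert(m != NULL);
	m->sge.page = len ? data : NULL;
	m->sge.length = len;
	m->queued = false;
}

/*
 * Enqueue the message. With drop_empty set, zero-length messages are
 * rejected before they reach the queue, which is the idea behind the
 * proposed check. Returns true if the message was queued.
 */
static bool model_enqueue(struct model_msg *m, bool drop_empty)
{
	if (drop_empty && m->sge.length == 0)
		return false;
	m->queued = true;
	return true;
}

/* Reader side: would the receive path be handed a NULL page? */
static bool model_reader_sees_null_page(const struct model_msg *m)
{
	return m->queued && m->sge.page == NULL;
}
```

Without the drop, the empty message reaches the reader with a NULL page; with it, the reader never sees the message at all, so no null check is needed on the receive side of the model.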
From: Geliang Tang tanggeliang@kylinos.cn
Run the following BPF selftests on Loongarch:
./test_sockmap
A Kernel panic occurs:
'''
Oops[#1]:
CPU: 20 PID: 23245 Comm: test_sockmap Tainted: G OE 6.10.0-rc2+ #32
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018-V4.0.11
pc 900000000426cd1c ra 90000000043a315c tp 900010008bfbc000 sp 900010008bfbf8a0
a0 ffffffffffffffe4 a1 900010008bfbfe20 a2 9000100089cd9400 a3 0000000000000003
a4 900010008bfbfb80 a5 900010008bfbfe20 a6 0000000000000000 a7 00000000000000d3
t0 0000000000000000 t1 0000000000000000 t2 0000000000008000 t3 0000000000000000
t4 0000000000000000 t5 0000000000000000 t6 0000000000000006 t7 fffffef1fea12c80
t8 fffffffffffffffc u0 0000000400000005 s9 0000000000000003 s0 0000000000000000
s1 0000000000000012 s2 900010008b9bbc00 s3 0000000000000018 s4 0000020000000000
s5 fffffffffffffffc s6 000000007fffffff s7 0000000000000002 s8 9000100089cd9400
   ra: 90000000043a315c tcp_bpf_sendmsg+0x23c/0x420
  ERA: 900000000426cd1c sk_msg_memcopy_from_iter+0xbc/0x220
 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
 PRMD: 0000000c (PPLV0 +PIE +PWE)
 EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
 ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
 BADV: 0000000000000040
 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
Process test_sockmap (pid: 23245, threadinfo=00000000aeb68043, task=00000000781bb2f1)
Stack : 0000000000000000 900010008bfbfe20 0000000000000000 0000000000000003
        0000000000000000 900010008bfbf94c 900010008bfbf950 0000000000000000
        0000000000000003 0000000000000003 900010008bfbfe10 900010008beeb400
        9000100089cd9400 0000000000000003 900010008b9bbc00 90000000043a315c
        0000000000084000 900010008bfbfe20 900010008bfbf958 900010008beeb5ac
        900010087fffd500 0000000000000000 7fffffffffffffff 0000000000000000
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        ...
Call Trace:
[<900000000426cd1c>] sk_msg_memcopy_from_iter+0xbc/0x220
[<90000000043a315c>] tcp_bpf_sendmsg+0x23c/0x420
[<90000000041cafc8>] __sock_sendmsg+0x68/0xe0
[<90000000041cc4bc>] ____sys_sendmsg+0x2bc/0x360
[<90000000041cea18>] ___sys_sendmsg+0xb8/0x120
[<90000000041cf1f8>] __sys_sendmsg+0x98/0x100
[<90000000045b76ec>] do_syscall+0x8c/0xc0
[<90000000030e1da4>] handle_syscall+0xc4/0x160

Code: 001532f7 0014f210 001036ed <28c10204> 298043ed 28c8632c 0010c5ef 0010bc84 0014ed8c

---[ end trace 0000000000000000 ]---
'''
This is because "sg_page(sge)" is NULL in that case. This patch adds a null check for it in sk_msg_memcopy_from_iter() to fix this error.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Geliang Tang tanggeliang@kylinos.cn
---
 net/core/skmsg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index bafcc1e2eadf..495b18b5dce5 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -375,6 +375,8 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from,
 
 	do {
 		sge = sk_msg_elem(msg, i);
+		if (!sg_page(sge))
+			goto out;
 		/* This is possible if a trim operation shrunk the buffer */
 		if (msg->sg.copybreak >= sge->length) {
 			msg->sg.copybreak = 0;
From: Geliang Tang tanggeliang@kylinos.cn
Run the following BPF selftests on Loongarch:
./test_progs -t sockmap_listen
A Kernel panic occurs:
'''
Oops[#1]:
CPU: 49 PID: 233429 Comm: new_name Tainted: G OE 6.10.0-rc2+ #20
Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018-V4.0.11
pc 0000000000000000 ra 90000000051ea4a0 tp 900030008549c000 sp 900030008549fe00
a0 9000300152524a00 a1 0000000000000000 a2 900030008549fe38 a3 900030008549fe30
a4 900030008549fe30 a5 90003000c58c8d80 a6 0000000000000000 a7 0000000000000039
t0 0000000000000000 t1 90003000c58c8d80 t2 0000000000000001 t3 0000000000000000
t4 0000000000000001 t5 900000011a1bf580 t6 900000011a3aff60 t7 000000000000006b
t8 00000fffffffffff u0 0000000000000000 s9 00007fffbbe9e930 s0 9000300152524a00
s1 90003000c58c8d00 s2 9000000006c81568 s3 0000000000000000 s4 90003000c58c8d80
s5 00007ffff236a000 s6 00007ffffbc292b0 s7 00007ffffbc29998 s8 00007fffbbe9f180
   ra: 90000000051ea4a0 inet_release+0x60/0xc0
  ERA: 0000000000000000 0x0
 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
 PRMD: 0000000c (PPLV0 +PIE +PWE)
 EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
 ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
ESTAT: 00030000 [PIF] (IS= ECode=3 EsubCode=0)
 BADV: 0000000000000000
 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp
Process new_name (pid: 233429, threadinfo=00000000b9196405, task=00000000c01df45b)
Stack : 0000000000000000 90003000c58c8e20 90003000c58c8d00 900000000505960c
        0000000000000000 9000000101c6ad20 9000300086524540 00000000082e0003
        900030008bf57400 90000000050596bc 900030008bf57400 900000000434acac
        0000000000000016 00007ffff224e060 00007fffbbe9f180 900030008bf57400
        0000000000000000 9000000004341ce0 00007fffbbe9f180 00007ffff2369000
        900030008549fec0 90000000054476ec 000000000000006b 9000000003f71da4
        000000000000003a 00007ffff22b8a44 00007fffbbe9f8e0 00007fffbbe9e680
        ffffffffffffffda 0000000000000000 0000000000000000 0000000000000000
        00007fffbbe9f288 0000000000000000 0000000000000000 0000000000000039
        84c2431493ceab6e 84c23ceb2827425e 0000000000000007 00007ffff2271600
        ...
Call Trace:
[<900000000505960c>] __sock_release+0x4c/0xe0
[<90000000050596bc>] sock_close+0x1c/0x40
[<900000000434acac>] __fput+0xec/0x2e0
[<9000000004341ce0>] sys_close+0x40/0xa0
[<90000000054476ec>] do_syscall+0x8c/0xc0
[<9000000003f71da4>] handle_syscall+0xc4/0x160

Code: (Bad address in era)

---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3d50000
 .text @ 0x9000000003f50000
 .data @ 0x90000000055b0000
 .bss  @ 0x9000000006ca9400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''
This is because the "sk->sk_prot->close" pointer is NULL in that case. This patch adds a null check for it in inet_release() to fix this error.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Geliang Tang tanggeliang@kylinos.cn
---
 net/ipv4/af_inet.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index b24d74616637..34a719e98c69 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -434,7 +434,8 @@ int inet_release(struct socket *sock)
 		if (sock_flag(sk, SOCK_LINGER) &&
 		    !(current->flags & PF_EXITING))
 			timeout = sk->sk_lingertime;
-		sk->sk_prot->close(sk, timeout);
+		if (sk->sk_prot->close)
+			sk->sk_prot->close(sk, timeout);
 		sock->sk = NULL;
 	}
 	return 0;
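The guard in this patch follows the usual pattern for an optional callback in an operations table. Below is a minimal user-space sketch of that pattern, assuming made-up types (model_proto, model_sock) and helpers (model_close, model_release) that are not part of the kernel. Note that such a guard only avoids the jump to address 0; it does not explain why the pointer is unset in the first place.

```c
#include <assert.h>
#include <stddef.h>

struct model_sock;

/* Hypothetical operations table with an optional close callback. */
struct model_proto {
	void (*close)(struct model_sock *sk, long timeout);
};

/* Hypothetical socket that records whether close ran. */
struct model_sock {
	const struct model_proto *prot;
	int closed;
};

/* Example callback used by the sketch. */
static void model_close(struct model_sock *sk, long timeout)
{
	(void)timeout;
	sk->closed = 1;
}

/*
 * Call the close callback only if the table provides one.
 * Returns 1 if the callback ran, 0 if it was skipped.
 */
static int model_release(struct model_sock *sk, long timeout)
{
	assert(sk != NULL && sk->prot != NULL);
	if (!sk->prot->close)
		return 0;
	sk->prot->close(sk, timeout);
	return 1;
}
```

Returning a status from the hypothetical model_release() makes the skipped case visible to the caller, which a silent skip would not.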
On Tue, Jun 25, 2024 at 10:25 AM Geliang Tang geliang@kernel.org wrote:
From: Geliang Tang tanggeliang@kylinos.cn
Run the following BPF selftests on Loongarch:
./test_progs -t sockmap_listen
A Kernel panic occurs:
''' Oops[#1]: CPU: 49 PID: 233429 Comm: new_name Tainted: G OE 6.10.0-rc2+ #20 Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018-V4.0.11 pc 0000000000000000 ra 90000000051ea4a0 tp 900030008549c000 sp 900030008549fe00 a0 9000300152524a00 a1 0000000000000000 a2 900030008549fe38 a3 900030008549fe30 a4 900030008549fe30 a5 90003000c58c8d80 a6 0000000000000000 a7 0000000000000039 t0 0000000000000000 t1 90003000c58c8d80 t2 0000000000000001 t3 0000000000000000 t4 0000000000000001 t5 900000011a1bf580 t6 900000011a3aff60 t7 000000000000006b t8 00000fffffffffff u0 0000000000000000 s9 00007fffbbe9e930 s0 9000300152524a00 s1 90003000c58c8d00 s2 9000000006c81568 s3 0000000000000000 s4 90003000c58c8d80 s5 00007ffff236a000 s6 00007ffffbc292b0 s7 00007ffffbc29998 s8 00007fffbbe9f180 ra: 90000000051ea4a0 inet_release+0x60/0xc0 ERA: 0000000000000000 0x0 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 0000000c (PPLV0 +PIE +PWE) EUEN: 00000000 (-FPE -SXE -ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00030000 [PIF] (IS= ECode=3 EsubCode=0) BADV: 0000000000000000 PRID: 0014c011 (Loongson-64bit, Loongson-3C5000) Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp Process new_name (pid: 233429, threadinfo=00000000b9196405, task=00000000c01df45b) Stack : 0000000000000000 90003000c58c8e20 90003000c58c8d00 900000000505960c 0000000000000000 9000000101c6ad20 9000300086524540 00000000082e0003 900030008bf57400 90000000050596bc 900030008bf57400 900000000434acac 0000000000000016 00007ffff224e060 00007fffbbe9f180 900030008bf57400 0000000000000000 9000000004341ce0 00007fffbbe9f180 00007ffff2369000 900030008549fec0 90000000054476ec 000000000000006b 9000000003f71da4 000000000000003a 00007ffff22b8a44 00007fffbbe9f8e0 00007fffbbe9e680 ffffffffffffffda 0000000000000000 0000000000000000 0000000000000000 00007fffbbe9f288 0000000000000000 0000000000000000 0000000000000039 84c2431493ceab6e 84c23ceb2827425e 0000000000000007 
00007ffff2271600 ...
Call Trace:
[<900000000505960c>] __sock_release+0x4c/0xe0
[<90000000050596bc>] sock_close+0x1c/0x40
[<900000000434acac>] __fput+0xec/0x2e0
[<9000000004341ce0>] sys_close+0x40/0xa0
[<90000000054476ec>] do_syscall+0x8c/0xc0
[<9000000003f71da4>] handle_syscall+0xc4/0x160

Code: (Bad address in era)

---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception
Kernel relocated by 0x3d50000
 .text @ 0x9000000003f50000
 .data @ 0x90000000055b0000
 .bss @ 0x9000000006ca9400
---[ end Kernel panic - not syncing: Fatal exception ]---
'''
This is because the "sk->sk_prot->close" pointer is NULL in that case. This patch adds a null check for it in inet_release() to fix this error.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Geliang Tang tanggeliang@kylinos.cn

 net/ipv4/af_inet.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index b24d74616637..34a719e98c69 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -434,7 +434,8 @@ int inet_release(struct socket *sock)
 		if (sock_flag(sk, SOCK_LINGER) &&
 		    !(current->flags & PF_EXITING))
 			timeout = sk->sk_lingertime;
-		sk->sk_prot->close(sk, timeout);
+		if (sk->sk_prot->close)
+			sk->sk_prot->close(sk, timeout);
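As a standalone illustration of the guard in the hunk above, here is a minimal userspace sketch. The demo_* types and functions are hypothetical stand-ins, not the kernel's struct proto or struct sock. It only shows the pattern of testing an optional callback before calling it, and says nothing about which protocol, if any, leaves ->close unset.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sock / struct proto; not kernel types. */
struct demo_sock;

struct demo_proto {
	/* Optional callback: may legitimately be NULL in this sketch. */
	void (*close)(struct demo_sock *sk, long timeout);
};

struct demo_sock {
	const struct demo_proto *prot;
	int closed;
};

/* Example callback that records that it ran. */
static void demo_close(struct demo_sock *sk, long timeout)
{
	(void)timeout;
	sk->closed = 1;
}

/*
 * Call the close callback only if it is set.
 * Returns 1 if the callback ran, 0 if it was skipped.
 */
static int demo_release(struct demo_sock *sk, long timeout)
{
	if (!sk->prot->close)
		return 0;

	sk->prot->close(sk, timeout);
	return 1;
}
```

Without the check, calling through a NULL pointer jumps to address 0, which matches the "pc 0000000000000000" and "Bad address in era" lines in the oops above.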
Can you tell us which inet protocol does not have a ->close pointer?
I find it hard to believe that a day-0 bug only hit LoongArch in 2024.
From: Geliang Tang tanggeliang@kylinos.cn
Run the BPF selftest bpf_tcp_ca on Loongarch:
./test_progs -t bpf_tcp_ca
A "Segmentation fault" error occurs:
'''
test_dctcp:PASS:bpf_dctcp__open_and_load 0 nsec
test_dctcp:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/1 bpf_tcp_ca/dctcp:FAIL
test_cubic:PASS:bpf_cubic__open_and_load 0 nsec
test_cubic:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/2 bpf_tcp_ca/cubic:FAIL
test_dctcp_fallback:PASS:dctcp_skel 0 nsec
test_dctcp_fallback:PASS:bpf_dctcp__load 0 nsec
test_dctcp_fallback:FAIL:dctcp link unexpected error: -524
#29/4 bpf_tcp_ca/dctcp_fallback:FAIL
test_write_sk_pacing:PASS:open_and_load 0 nsec
test_write_sk_pacing:FAIL:attach_struct_ops unexpected error: -524
#29/6 bpf_tcp_ca/write_sk_pacing:FAIL
test_update_ca:PASS:open 0 nsec
test_update_ca:FAIL:attach_struct_ops unexpected error: -524
settcpca:FAIL:setsockopt unexpected setsockopt: actual -1 == expected -1
(network_helpers.c:99: errno: No such file or directory) Failed to call post_socket_cb
start_test:FAIL:start_server_str unexpected start_server_str: actual -1 == expected -1
test_update_ca:FAIL:ca1_ca1_cnt unexpected ca1_ca1_cnt: actual 0 <= expected 0
#29/9 bpf_tcp_ca/update_ca:FAIL
#29 bpf_tcp_ca:FAIL
Caught signal #11!
Stack trace:
./test_progs(crash_handler+0x28)[0x5555567ed91c]
linux-vdso.so.1(__vdso_rt_sigreturn+0x0)[0x7ffffee408b0]
./test_progs(bpf_link__update_map+0x80)[0x555556824a78]
./test_progs(+0x94d68)[0x5555564c4d68]
./test_progs(test_bpf_tcp_ca+0xe8)[0x5555564c6a88]
./test_progs(+0x3bde54)[0x5555567ede54]
./test_progs(main+0x61c)[0x5555567efd54]
/usr/lib64/libc.so.6(+0x22208)[0x7ffff2aaa208]
/usr/lib64/libc.so.6(__libc_start_main+0xac)[0x7ffff2aaa30c]
./test_progs(_start+0x48)[0x55555646bca8]
Segmentation fault
'''
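This is because bpf_map__attach_struct_ops() fails and "link" is NULL in that case, yet the tests go on to use it (the stack trace shows the crash in bpf_link__update_map()). This patch adds null checks for link in bpf_tcp_ca to fix this error.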
Signed-off-by: Geliang Tang tanggeliang@kylinos.cn --- .../selftests/bpf/prog_tests/bpf_tcp_ca.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
index bceff5900016..e920ecf0e888 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
@@ -411,7 +411,8 @@ static void test_update_ca(void)
 		return;
 
 	link = bpf_map__attach_struct_ops(skel->maps.ca_update_1);
-	ASSERT_OK_PTR(link, "attach_struct_ops");
+	if (!ASSERT_OK_PTR(link, "attach_struct_ops"))
+		goto err;
 
 	do_test(&opts);
 	saved_ca1_cnt = skel->bss->ca1_cnt;
@@ -425,6 +426,7 @@ static void test_update_ca(void)
 	ASSERT_GT(skel->bss->ca2_cnt, 0, "ca2_ca2_cnt");
 
 	bpf_link__destroy(link);
+err:
 	tcp_ca_update__destroy(skel);
 }
 
@@ -447,7 +449,8 @@ static void test_update_wrong(void)
 		return;
 
 	link = bpf_map__attach_struct_ops(skel->maps.ca_update_1);
-	ASSERT_OK_PTR(link, "attach_struct_ops");
+	if (!ASSERT_OK_PTR(link, "attach_struct_ops"))
+		goto err;
 
 	do_test(&opts);
 	saved_ca1_cnt = skel->bss->ca1_cnt;
@@ -460,6 +463,7 @@ static void test_update_wrong(void)
 	ASSERT_GT(skel->bss->ca1_cnt, saved_ca1_cnt, "ca2_ca1_cnt");
 
 	bpf_link__destroy(link);
+err:
 	tcp_ca_update__destroy(skel);
 }
 
@@ -484,7 +488,8 @@ static void test_mixed_links(void)
 	ASSERT_OK_PTR(link_nl, "attach_struct_ops_nl");
 
 	link = bpf_map__attach_struct_ops(skel->maps.ca_update_1);
-	ASSERT_OK_PTR(link, "attach_struct_ops");
+	if (!ASSERT_OK_PTR(link, "attach_struct_ops"))
+		goto err;
 
 	do_test(&opts);
 	ASSERT_GT(skel->bss->ca1_cnt, 0, "ca1_ca1_cnt");
@@ -493,6 +498,7 @@ static void test_mixed_links(void)
 	ASSERT_ERR(err, "update_map");
 
 	bpf_link__destroy(link);
+err:
 	bpf_link__destroy(link_nl);
 	tcp_ca_update__destroy(skel);
 }
@@ -536,7 +542,8 @@ static void test_link_replace(void)
 	bpf_link__destroy(link);
 
 	link = bpf_map__attach_struct_ops(skel->maps.ca_update_2);
-	ASSERT_OK_PTR(link, "attach_struct_ops_2nd");
+	if (!ASSERT_OK_PTR(link, "attach_struct_ops_2nd"))
+		goto err;
 
 	/* BPF_F_REPLACE with a wrong old map Fd. It should fail!
 	 *
@@ -559,6 +566,7 @@ static void test_link_replace(void)
 
 	bpf_link__destroy(link);
 
+err:
 	tcp_ca_update__destroy(skel);
 }
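The "check, then goto err" shape used in each hunk above can be sketched outside the selftest framework as follows. Everything named demo_* is hypothetical and only models the control flow: a failed attach skips every use of the link but still reaches the cleanup of the outer resource.

```c
#include <stdlib.h>

/* Hypothetical resources standing in for the skeleton and the link. */
struct demo_skel { int destroyed; };
struct demo_link { int id; };

/* Returns NULL on failure, like an attach call that is not supported. */
static struct demo_link *demo_attach(int should_fail)
{
	if (should_fail)
		return NULL;
	return calloc(1, sizeof(struct demo_link));
}

/*
 * Returns 0 on success and -1 if the attach failed.
 * The skeleton is marked destroyed on both paths.
 */
static int demo_test(struct demo_skel *skel, int attach_fails)
{
	struct demo_link *link;
	int ret = -1;

	link = demo_attach(attach_fails);
	if (!link)
		goto err;	/* skip every use of link below */

	link->id = 1;		/* would dereference NULL without the check */
	ret = 0;
	free(link);
err:
	skel->destroyed = 1;	/* cleanup shared by both paths */
	return ret;
}
```

Placing the label after the link teardown and before the skeleton teardown keeps a single exit path, which is why "goto err" fits here better than an early "return" that would leak the skeleton.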
On Tue, Jun 25, 2024 at 4:25 PM Geliang Tang geliang@kernel.org wrote:
> From: Geliang Tang tanggeliang@kylinos.cn
>
> v2:
>  - add patch 2, a new fix for sk_msg_memcopy_from_iter.
>  - update patch 3, only test "sk->sk_prot->close" as Eric suggested.
>  - update patch 4, use "goto err" instead of "return" as Eduard suggested.
>  - add "fixes" tag for patch 1-3.
>  - change subject prefixes as "bpf-next" to trigger BPF CI.
>  - cc Loongarch maintainers too.
>
> BPF selftests seem to have not been fully tested on Loongarch. When I ran these tests on Loongarch recently, some errors occur. This patch set contains some null-check related fixes for these errors.
Is the root cause that LoongArch lacks bpf trampoline?
Huacai
On Tue, 2024-06-25 at 16:29 +0800, Huacai Chen wrote:
> On Tue, Jun 25, 2024 at 4:25 PM Geliang Tang geliang@kernel.org wrote:
>> From: Geliang Tang tanggeliang@kylinos.cn
>>
>> v2:
>>  - add patch 2, a new fix for sk_msg_memcopy_from_iter.
>>  - update patch 3, only test "sk->sk_prot->close" as Eric suggested.
>>  - update patch 4, use "goto err" instead of "return" as Eduard suggested.
>>  - add "fixes" tag for patch 1-3.
>>  - change subject prefixes as "bpf-next" to trigger BPF CI.
>>  - cc Loongarch maintainers too.
>>
>> BPF selftests seem to have not been fully tested on Loongarch. When I ran these tests on Loongarch recently, some errors occur. This patch set contains some null-check related fixes for these errors.
>
> Is the root cause that LoongArch lacks bpf trampoline?
No. These errors don't seem to be directly related to the lack of a BPF trampoline. I did get some errors caused by the missing BPF trampoline, and they look like this:
test_dctcp:PASS:bpf_dctcp__open_and_load 0 nsec
test_dctcp:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/1 bpf_tcp_ca/dctcp:FAIL
test_cubic:PASS:bpf_cubic__open_and_load 0 nsec
test_cubic:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/2 bpf_tcp_ca/cubic:FAIL
test_dctcp_fallback:PASS:dctcp_skel 0 nsec
test_dctcp_fallback:PASS:bpf_dctcp__load 0 nsec
test_dctcp_fallback:FAIL:dctcp link unexpected error: -524
#29/4 bpf_tcp_ca/dctcp_fallback:FAIL
test_write_sk_pacing:PASS:open_and_load 0 nsec
test_write_sk_pacing:FAIL:attach_struct_ops unexpected error: -524
#29/6 bpf_tcp_ca/write_sk_pacing:FAIL
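For readers decoding the "-524" in the log above: 524 is the kernel-internal ENOTSUPP ("Operation is not supported"), which userspace <errno.h> does not define. The sketch below shows one way a test could recognise it. The helper name demo_is_unsupported() is hypothetical, and whether the selftests should skip or fail on this error is not something this thread settles.

```c
#include <errno.h>

/* Kernel-internal errno value; not provided by userspace <errno.h>. */
#ifndef ENOTSUPP
#define ENOTSUPP 524
#endif

/*
 * Hypothetical helper: returns 1 if a negative error code means
 * "not supported by this kernel or architecture", 0 otherwise.
 */
static int demo_is_unsupported(long err)
{
	return err == -ENOTSUPP || err == -EOPNOTSUPP;
}
```

With such a helper, a caller could tell "the feature is missing here" apart from a genuine failure such as -ENOENT.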
Thanks, -Geliang
On Tue, Jun 25, 2024 at 5:08 PM Geliang Tang geliang@kernel.org wrote:
> On Tue, 2024-06-25 at 16:29 +0800, Huacai Chen wrote:
>> On Tue, Jun 25, 2024 at 4:25 PM Geliang Tang geliang@kernel.org wrote:
>>> From: Geliang Tang tanggeliang@kylinos.cn
>>>
>>> v2:
>>>  - add patch 2, a new fix for sk_msg_memcopy_from_iter.
>>>  - update patch 3, only test "sk->sk_prot->close" as Eric suggested.
>>>  - update patch 4, use "goto err" instead of "return" as Eduard suggested.
>>>  - add "fixes" tag for patch 1-3.
>>>  - change subject prefixes as "bpf-next" to trigger BPF CI.
>>>  - cc Loongarch maintainers too.
>>>
>>> BPF selftests seem to have not been fully tested on Loongarch. When I ran these tests on Loongarch recently, some errors occur. This patch set contains some null-check related fixes for these errors.
>>
>> Is the root cause that LoongArch lacks bpf trampoline?
>
> No. These errors don't seem to be directly related to the lack of a BPF trampoline. I did get some errors caused by the missing BPF trampoline, and they look like this:
If so, these errors do not seem to be specific to LoongArch.
Huacai