From: Feng Zhou zhoufeng.zf@bytedance.com
When TCP over IPv4 via INET6 API, sk->sk_family is AF_INET6, but it is a v4 pkt. inet_csk(sk)->icsk_af_ops is ipv6_mapped and use ip_queue_xmit. Some sockopt did not take effect, such as tos.
0001: Use sk_is_inet helper to fix it. 0002: Setget_sockopt add a test for tcp over ipv4 via ipv6.
Changelog: v2->v3: Addressed comments from Eric Dumazet - Use sk_is_inet() helper Details in here: https://lore.kernel.org/bpf/CANn89i+9GmBLCdgsfH=WWe-tyFYpiO27wONyxaxiU6aOBC6...
v1->v2: Addressed comments from kernel test robot - Fix compilation error Details in here: https://lore.kernel.org/bpf/202408152058.YXAnhLgZ-lkp@intel.com/T/
Feng Zhou (2): bpf: Fix bpf_get/setsockopt to tos not take effect when TCP over IPv4 via INET6 API selftests/bpf: Setget_sockopt add a test for tcp over ipv4 via ipv6
net/core/filter.c | 7 +++- .../selftests/bpf/prog_tests/setget_sockopt.c | 33 +++++++++++++++++++ .../selftests/bpf/progs/setget_sockopt.c | 13 ++++++-- 3 files changed, 49 insertions(+), 4 deletions(-)
From: Feng Zhou zhoufeng.zf@bytedance.com
when TCP over IPv4 via INET6 API, bpf_get/setsockopt with ipv4 will fail, because sk->sk_family is AF_INET6. With ipv6 will success, not take effect, because inet_csk(sk)->icsk_af_ops is ipv6_mapped and use ip_queue_xmit, inet_sk(sk)->tos.
Bpf_get/setsockopt use sk_is_inet() helper to fix this case.
Signed-off-by: Feng Zhou zhoufeng.zf@bytedance.com --- net/core/filter.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/core/filter.c b/net/core/filter.c index e4a4454df5f9..90f4dbb8d2b5 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5399,7 +5399,12 @@ static int sol_ip_sockopt(struct sock *sk, int optname, char *optval, int *optlen, bool getopt) { - if (sk->sk_family != AF_INET) + + /* + * SOL_IP socket options are available on AF_INET and AF_INET6, for + * example, TCP over IPv4 via INET6 API. + */ + if (!sk_is_inet(sk)) return -EINVAL;
switch (optname) {
On Sat, Sep 14, 2024 at 12:32 PM Feng zhou zhoufeng.zf@bytedance.com wrote:
From: Feng Zhou zhoufeng.zf@bytedance.com
when TCP over IPv4 via INET6 API, bpf_get/setsockopt with ipv4 will fail, because sk->sk_family is AF_INET6. With ipv6 will success, not take effect, because inet_csk(sk)->icsk_af_ops is ipv6_mapped and use ip_queue_xmit, inet_sk(sk)->tos.
Bpf_get/setsockopt use sk_is_inet() helper to fix this case.
Signed-off-by: Feng Zhou zhoufeng.zf@bytedance.com
Reviewed-by: Eric Dumazet edumazet@google.com
On 9/14/24 3:32 AM, Feng zhou wrote:
From: Feng Zhou zhoufeng.zf@bytedance.com
when TCP over IPv4 via INET6 API, bpf_get/setsockopt with ipv4 will
I think you meant bpf_get/setsockopt with SOL_IP will fail. so s/ipv4/SOL_IP/?
fail, because sk->sk_family is AF_INET6. With ipv6 will success, not take effect, because inet_csk(sk)->icsk_af_ops is ipv6_mapped and use ip_queue_xmit, inet_sk(sk)->tos.
Change lgtm.
Patch 2 has a conflict, so can you please reword this commit message to reflect the latest change. e.g. afaik, this is no longer specific to mapped address or not.
Bpf_get/setsockopt use sk_is_inet() helper to fix this case.
Signed-off-by: Feng Zhou zhoufeng.zf@bytedance.com
net/core/filter.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/core/filter.c b/net/core/filter.c index e4a4454df5f9..90f4dbb8d2b5 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5399,7 +5399,12 @@ static int sol_ip_sockopt(struct sock *sk, int optname, char *optval, int *optlen, bool getopt) {
- if (sk->sk_family != AF_INET)
- /*
* SOL_IP socket options are available on AF_INET and AF_INET6, for
* example, TCP over IPv4 via INET6 API.
*/
- if (!sk_is_inet(sk)) return -EINVAL;
switch (optname) {
在 2024/10/1 10:27, Martin KaFai Lau 写道:
On 9/14/24 3:32 AM, Feng zhou wrote:
From: Feng Zhou zhoufeng.zf@bytedance.com
when TCP over IPv4 via INET6 API, bpf_get/setsockopt with ipv4 will
I think you meant bpf_get/setsockopt with SOL_IP will fail. so s/ipv4/SOL_IP/?
fail, because sk->sk_family is AF_INET6. With ipv6 will success, not take effect, because inet_csk(sk)->icsk_af_ops is ipv6_mapped and use ip_queue_xmit, inet_sk(sk)->tos.
Change lgtm.
Patch 2 has a conflict, so can you please reword this commit message to reflect the latest change. e.g. afaik, this is no longer specific to mapped address or not.
Sorry for taking so long to reply.
Will do, thanks.
Bpf_get/setsockopt use sk_is_inet() helper to fix this case.
Signed-off-by: Feng Zhou zhoufeng.zf@bytedance.com
net/core/filter.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/core/filter.c b/net/core/filter.c index e4a4454df5f9..90f4dbb8d2b5 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5399,7 +5399,12 @@ static int sol_ip_sockopt(struct sock *sk, int optname, char *optval, int *optlen, bool getopt) { - if (sk->sk_family != AF_INET)
+ /* + * SOL_IP socket options are available on AF_INET and AF_INET6, for + * example, TCP over IPv4 via INET6 API. + */ + if (!sk_is_inet(sk)) return -EINVAL; switch (optname) {
From: Feng Zhou zhoufeng.zf@bytedance.com
This patch adds a test for TCP over IPv4 via INET6 API.
Signed-off-by: Feng Zhou zhoufeng.zf@bytedance.com --- .../selftests/bpf/prog_tests/setget_sockopt.c | 33 +++++++++++++++++++ .../selftests/bpf/progs/setget_sockopt.c | 13 ++++++-- 2 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c index 7d4a9b3d3722..3cad92128e60 100644 --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c @@ -15,8 +15,11 @@
#define CG_NAME "/setget-sockopt-test"
+#define INT_PORT 8008 + static const char addr4_str[] = "127.0.0.1"; static const char addr6_str[] = "::1"; +static const char addr6_any_str[] = "::"; static struct setget_sockopt *skel; static int cg_fd;
@@ -67,6 +70,35 @@ static void test_tcp(int family) ASSERT_EQ(bss->nr_binddev, 2, "nr_bind"); }
+static void test_tcp_over_ipv4_via_ipv6(void) +{ + struct setget_sockopt__bss *bss = skel->bss; + int sfd, cfd; + + memset(bss, 0, sizeof(*bss)); + skel->bss->test_tcp_over_ipv4_via_ipv6 = 1; + + sfd = start_server(AF_INET6, SOCK_STREAM, + addr6_any_str, INT_PORT, 0); + if (!ASSERT_GE(sfd, 0, "start_server")) + return; + + cfd = connect_to_addr_str(AF_INET, SOCK_STREAM, addr4_str, INT_PORT, NULL); + if (!ASSERT_GE(cfd, 0, "connect_to_addr_str")) { + close(sfd); + return; + } + close(sfd); + close(cfd); + + ASSERT_EQ(bss->nr_listen, 1, "nr_listen"); + ASSERT_EQ(bss->nr_connect, 1, "nr_connect"); + ASSERT_EQ(bss->nr_active, 1, "nr_active"); + ASSERT_EQ(bss->nr_passive, 1, "nr_passive"); + ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create"); + ASSERT_EQ(bss->nr_binddev, 2, "nr_bind"); +} + static void test_udp(int family) { struct setget_sockopt__bss *bss = skel->bss; @@ -191,6 +223,7 @@ void test_setget_sockopt(void) test_udp(AF_INET); test_ktls(AF_INET6); test_ktls(AF_INET); + test_tcp_over_ipv4_via_ipv6();
done: setget_sockopt__destroy(skel); diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c index 60518aed1ffc..ff834d94dd23 100644 --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c @@ -20,6 +20,7 @@ int nr_connect; int nr_binddev; int nr_socket_post_create; int nr_fin_wait1; +int test_tcp_over_ipv4_via_ipv6;
struct sockopt_test { int opt; @@ -262,9 +263,15 @@ static int bpf_test_sockopt(void *ctx, struct sock *sk) if (n != ARRAY_SIZE(sol_ip_tests)) return -1; } else { - n = bpf_loop(ARRAY_SIZE(sol_ipv6_tests), bpf_test_ipv6_sockopt, &lc, 0); - if (n != ARRAY_SIZE(sol_ipv6_tests)) - return -1; + if (test_tcp_over_ipv4_via_ipv6) { + n = bpf_loop(ARRAY_SIZE(sol_ip_tests), bpf_test_ip_sockopt, &lc, 0); + if (n != ARRAY_SIZE(sol_ip_tests)) + return -1; + } else { + n = bpf_loop(ARRAY_SIZE(sol_ipv6_tests), bpf_test_ipv6_sockopt, &lc, 0); + if (n != ARRAY_SIZE(sol_ipv6_tests)) + return -1; + } }
return 0;
On 9/14/24 3:32 AM, Feng zhou wrote:
From: Feng Zhou zhoufeng.zf@bytedance.com
This patch adds a test for TCP over IPv4 via INET6 API.
Signed-off-by: Feng Zhou zhoufeng.zf@bytedance.com
.../selftests/bpf/prog_tests/setget_sockopt.c | 33 +++++++++++++++++++ .../selftests/bpf/progs/setget_sockopt.c | 13 ++++++-- 2 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c index 7d4a9b3d3722..3cad92128e60 100644 --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c @@ -15,8 +15,11 @@ #define CG_NAME "/setget-sockopt-test" +#define INT_PORT 8008
- static const char addr4_str[] = "127.0.0.1"; static const char addr6_str[] = "::1";
+static const char addr6_any_str[] = "::"; static struct setget_sockopt *skel; static int cg_fd; @@ -67,6 +70,35 @@ static void test_tcp(int family) ASSERT_EQ(bss->nr_binddev, 2, "nr_bind"); } +static void test_tcp_over_ipv4_via_ipv6(void) +{
- struct setget_sockopt__bss *bss = skel->bss;
- int sfd, cfd;
- memset(bss, 0, sizeof(*bss));
- skel->bss->test_tcp_over_ipv4_via_ipv6 = 1;
- sfd = start_server(AF_INET6, SOCK_STREAM,
addr6_any_str, INT_PORT, 0);
- if (!ASSERT_GE(sfd, 0, "start_server"))
return;
- cfd = connect_to_addr_str(AF_INET, SOCK_STREAM, addr4_str, INT_PORT, NULL);
- if (!ASSERT_GE(cfd, 0, "connect_to_addr_str")) {
close(sfd);
return;
- }
- close(sfd);
- close(cfd);
- ASSERT_EQ(bss->nr_listen, 1, "nr_listen");
- ASSERT_EQ(bss->nr_connect, 1, "nr_connect");
- ASSERT_EQ(bss->nr_active, 1, "nr_active");
- ASSERT_EQ(bss->nr_passive, 1, "nr_passive");
- ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create");
- ASSERT_EQ(bss->nr_binddev, 2, "nr_bind");
+}
- static void test_udp(int family) { struct setget_sockopt__bss *bss = skel->bss;
@@ -191,6 +223,7 @@ void test_setget_sockopt(void) test_udp(AF_INET); test_ktls(AF_INET6); test_ktls(AF_INET);
- test_tcp_over_ipv4_via_ipv6();
This has a conflict with commit d53050934e66.
pw-bot: cr
done: setget_sockopt__destroy(skel); diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c index 60518aed1ffc..ff834d94dd23 100644 --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c @@ -20,6 +20,7 @@ int nr_connect; int nr_binddev; int nr_socket_post_create; int nr_fin_wait1; +int test_tcp_over_ipv4_via_ipv6; struct sockopt_test { int opt; @@ -262,9 +263,15 @@ static int bpf_test_sockopt(void *ctx, struct sock *sk) if (n != ARRAY_SIZE(sol_ip_tests)) return -1; } else {
n = bpf_loop(ARRAY_SIZE(sol_ipv6_tests), bpf_test_ipv6_sockopt, &lc, 0);
if (n != ARRAY_SIZE(sol_ipv6_tests))
return -1;
if (test_tcp_over_ipv4_via_ipv6) {
n = bpf_loop(ARRAY_SIZE(sol_ip_tests), bpf_test_ip_sockopt, &lc, 0);
Can this bpf_loop(..., bpf_test_ip_sockopt, ...) be always run? Then the above test_tcp_over_ipv4_via_ipv6() addition will not be needed.
if (n != ARRAY_SIZE(sol_ip_tests))
return -1;
} else {
n = bpf_loop(ARRAY_SIZE(sol_ipv6_tests), bpf_test_ipv6_sockopt, &lc, 0);
if (n != ARRAY_SIZE(sol_ipv6_tests))
return -1;
}}
return 0;
linux-kselftest-mirror@lists.linaro.org