This patch series adds a memory.reclaim proactive reclaim interface.
The rationale behind the interface and how it works are in the first
patch.
---
Changes in V2:
- Added the interface to root as well.
- Added a selftest.
- Documented the interface as a nested-keyed interface, which makes
adding optional arguments in the future easier (see doc updates in the
first patch).
- Modified the commit message to reflect the changes and to mention a
timeout argument as a possible future extension.
- Return -EAGAIN if the kernel fails to reclaim the full requested
amount.
---
Shakeel Butt (1):
memcg: introduce per-memcg reclaim interface
Yosry Ahmed (3):
selftests: cgroup: return the errno of write() in cg_write() on
failure
selftests: cgroup: fix alloc_anon_noexit() instantly freeing memory
selftests: cgroup: add a selftest for memory.reclaim
Documentation/admin-guide/cgroup-v2.rst | 21 +++++
mm/memcontrol.c | 37 ++++++++
tools/testing/selftests/cgroup/cgroup_util.c | 11 ++-
.../selftests/cgroup/test_memcontrol.c | 94 ++++++++++++++++++-
4 files changed, 156 insertions(+), 7 deletions(-)
--
2.35.1.1178.g4f1659d476-goog
The selftest "mqueue/mq_perf_tests.c" uses CPU_ALLOC to allocate a
CPU set. This CPU set is then used by pthread_attr_setaffinity_np
and by pthread_create. In the current code, however, the allocated
CPU set is never freed. Fix this by adding CPU_FREE once the set is
no longer needed.
Signed-off-by: Athira Rajeev <atrajeev(a)linux.vnet.ibm.com>
---
tools/testing/selftests/mqueue/mq_perf_tests.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c
index b019e0b8221c..17c41f216bef 100644
--- a/tools/testing/selftests/mqueue/mq_perf_tests.c
+++ b/tools/testing/selftests/mqueue/mq_perf_tests.c
@@ -732,6 +732,7 @@ int main(int argc, char *argv[])
pthread_attr_destroy(&thread_attr);
}
+ CPU_FREE(cpu_set);
if (!continuous_mode) {
pthread_join(cpu_threads[0], &retval);
shutdown((long)retval, "perf_test_thread()", __LINE__);
--
2.35.1
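For readers unfamiliar with the dynamic CPU set API, here is a minimal
standalone sketch of the allocate/use/free pattern that the patch above
completes. The helper spawn_pinned_thread() and its parameters are
hypothetical and are not taken from mq_perf_tests.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for CPU_ALLOC and pthread_attr_setaffinity_np */
#endif
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: start fn(arg) on a thread pinned to "cpu", where
 * "ncpus" is the number of CPUs the set must be able to describe.
 * Returns 0 on success or a positive errno value, like pthread_create().
 */
static int spawn_pinned_thread(pthread_t *thread, int cpu, int ncpus,
			       void *(*fn)(void *), void *arg)
{
	pthread_attr_t attr;
	cpu_set_t *set;
	size_t size;
	int ret;

	if (cpu < 0 || cpu >= ncpus)
		return EINVAL;

	set = CPU_ALLOC(ncpus);
	if (!set)
		return ENOMEM;
	size = CPU_ALLOC_SIZE(ncpus);
	CPU_ZERO_S(size, set);
	CPU_SET_S(cpu, size, set);

	ret = pthread_attr_init(&attr);
	if (ret)
		goto out_free;

	ret = pthread_attr_setaffinity_np(&attr, size, set);
	if (!ret)
		ret = pthread_create(thread, &attr, fn, arg);

	pthread_attr_destroy(&attr);
out_free:
	/* The set is no longer needed once the thread has been created. */
	CPU_FREE(set);
	return ret;
}
```

Every exit path after a successful CPU_ALLOC goes through CPU_FREE, which
is the property the one-line fix restores in the selftest.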
bpf_tcp_gen_syncookie looks at the IP version in the IP header and
validates the address family of the socket. It supports IPv4 packets in
AF_INET6 dual-stack sockets.
On the other hand, bpf_tcp_check_syncookie looks only at the address
family of the socket, ignoring the real IP version in headers, and
validates only the packet size. This implementation has some drawbacks:
1. Packets are not validated properly, allowing a BPF program to trick
bpf_tcp_check_syncookie into handling an IPv6 packet on an IPv4
socket.
2. Dual-stack sockets fail the checks on IPv4 packets. IPv4 clients end
up receiving a SYNACK with the cookie, but the following ACK gets
dropped.
This patch fixes these issues by changing the checks in
bpf_tcp_check_syncookie to match the ones in bpf_tcp_gen_syncookie. The
IP version from the header is now taken into account and validated
against the address family of the socket.
Fixes: 399040847084 ("bpf: add helper to check for a valid SYN cookie")
Signed-off-by: Maxim Mikityanskiy <maximmi(a)nvidia.com>
Reviewed-by: Tariq Toukan <tariqt(a)nvidia.com>
Acked-by: Arthur Fabre <afabre(a)cloudflare.com>
---
net/core/filter.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index a7044e98765e..64470a727ef7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -7016,24 +7016,33 @@ BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len
if (!th->ack || th->rst || th->syn)
return -ENOENT;
+ if (unlikely(iph_len < sizeof(struct iphdr)))
+ return -EINVAL;
+
if (tcp_synq_no_recent_overflow(sk))
return -ENOENT;
cookie = ntohl(th->ack_seq) - 1;
- switch (sk->sk_family) {
- case AF_INET:
- if (unlikely(iph_len < sizeof(struct iphdr)))
+ /* Both struct iphdr and struct ipv6hdr have the version field at the
+ * same offset so we can cast to the shorter header (struct iphdr).
+ */
+ switch (((struct iphdr *)iph)->version) {
+ case 4:
+ if (sk->sk_family == AF_INET6 && ipv6_only_sock(sk))
return -EINVAL;
ret = __cookie_v4_check((struct iphdr *)iph, th, cookie);
break;
#if IS_BUILTIN(CONFIG_IPV6)
- case AF_INET6:
+ case 6:
if (unlikely(iph_len < sizeof(struct ipv6hdr)))
return -EINVAL;
+ if (sk->sk_family != AF_INET6)
+ return -EINVAL;
+
ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);
break;
#endif /* CONFIG_IPV6 */
--
2.30.2
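The combinations accepted after the patch above can be summarised with a
small userspace model. This is only a sketch: the function name, the
header-length constants and the -EPROTONOSUPPORT return for other IP
versions are assumptions (the default case lies outside the diff
context), and the CONFIG_IPV6 build condition is ignored.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Sizes of the fixed IPv4 and IPv6 headers, in bytes. */
#define IPV4_HDR_LEN 20
#define IPV6_HDR_LEN 40

/*
 * Hypothetical userspace model of the checks done before the cookie is
 * verified. Returns 0 if the packet may be checked against the socket,
 * -EINVAL for a rejected combination, or -EPROTONOSUPPORT (assumed) for
 * an unknown IP version.
 */
static int syncookie_family_check(int sk_family, bool ipv6_only,
				  const unsigned char *iph, size_t iph_len)
{
	/* The version nibble is only readable if an IPv4 header fits. */
	if (iph_len < IPV4_HDR_LEN)
		return -EINVAL;

	/* Both headers keep the version in the top 4 bits of byte 0. */
	switch (iph[0] >> 4) {
	case 4:
		/* IPv4 is fine on AF_INET and on dual-stack AF_INET6. */
		if (sk_family == AF_INET6 && ipv6_only)
			return -EINVAL;
		return 0;
	case 6:
		if (iph_len < IPV6_HDR_LEN)
			return -EINVAL;
		/* IPv6 packets are only valid on AF_INET6 sockets. */
		if (sk_family != AF_INET6)
			return -EINVAL;
		return 0;
	default:
		return -EPROTONOSUPPORT;
	}
}
```

With this model, an IPv4 packet on a dual-stack AF_INET6 socket is
accepted (drawback 2 in the commit message), while an IPv6 packet on an
AF_INET socket is rejected (drawback 1).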
We have switched to memcg-based memory accounting, so the rlimit is no
longer needed. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.
This patchset cleans up the usage of RLIMIT_MEMLOCK in tools/bpf/,
tools/testing/selftests/bpf and samples/bpf. The file
tools/testing/selftests/bpf/bpf_rlimit.h is removed. The sys/resource.h
include is also removed from the many files that no longer need it.
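For context, the manual step being cleaned up looks roughly like the
sketch below: the tool raises RLIMIT_MEMLOCK itself before loading any
BPF objects. The helper name set_memlock_rlimit() is hypothetical and the
code is not copied from any of the touched files; callers of this pattern
typically passed RLIM_INFINITY.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Hypothetical helper showing the legacy pattern: set both the soft and
 * the hard RLIMIT_MEMLOCK limit to "limit".
 * Returns 0 on success or a negative errno.
 */
static int set_memlock_rlimit(rlim_t limit)
{
	struct rlimit r = {
		.rlim_cur = limit,
		.rlim_max = limit,
	};

	if (setrlimit(RLIMIT_MEMLOCK, &r) < 0)
		return -errno;
	return 0;
}
```

With libbpf's strict mode enabled through libbpf_set_strict_mode(), the
tools no longer need a call of this kind, since libbpf decides on its own
whether the limit has to be raised.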
- v3: Get rid of bpf_rlimit.h and fix some typos (Andrii)
- v2: Use libbpf_set_strict_mode instead. (Andrii)
- v1: https://lore.kernel.org/bpf/20220320060815.7716-2-laoar.shao@gmail.com/
Yafang Shao (27):
bpf: selftests: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xdping
bpf: selftests: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xdpxceiver
bpf: selftests: No need to include bpf_rlimit.h in test_tcpnotify_user
bpf: selftests: No need to include bpf_rlimit.h in flow_dissector_load
bpf: selftests: Set libbpf 1.0 API mode explicitly in
get_cgroup_id_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_cgroup_storage
bpf: selftests: Set libbpf 1.0 API mode explicitly in
get_cgroup_id_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_lpm_map
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_lru_map
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_skb_cgroup_id_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sock_addr
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sock
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sockmap
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_sysctl
bpf: selftests: Set libbpf 1.0 API mode explicitly in test_tag
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_tcp_check_syncookie_user
bpf: selftests: Set libbpf 1.0 API mode explicitly in
test_verifier_log
bpf: samples: Set libbpf 1.0 API mode explicitly in hbm
bpf: selftests: Get rid of bpf_rlimit.h
bpf: selftests: No need to include sys/resource.h in some files
bpf: samples: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xdpsock_user
bpf: samples: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK in
xsk_fwd
bpf: samples: No need to include sys/resource.h in many files
bpf: bpftool: Remove useless return value of libbpf_set_strict_mode
bpf: bpftool: Set LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK for legacy libbpf
bpf: bpftool: remove RLIMIT_MEMLOCK
bpf: runqslower: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK
samples/bpf/cpustat_user.c | 1 -
samples/bpf/hbm.c | 5 ++--
samples/bpf/ibumad_user.c | 1 -
samples/bpf/map_perf_test_user.c | 1 -
samples/bpf/offwaketime_user.c | 1 -
samples/bpf/sockex2_user.c | 1 -
samples/bpf/sockex3_user.c | 1 -
samples/bpf/spintest_user.c | 1 -
samples/bpf/syscall_tp_user.c | 1 -
samples/bpf/task_fd_query_user.c | 1 -
samples/bpf/test_lru_dist.c | 1 -
samples/bpf/test_map_in_map_user.c | 1 -
samples/bpf/test_overhead_user.c | 1 -
samples/bpf/tracex2_user.c | 1 -
samples/bpf/tracex3_user.c | 1 -
samples/bpf/tracex4_user.c | 1 -
samples/bpf/tracex5_user.c | 1 -
samples/bpf/tracex6_user.c | 1 -
samples/bpf/xdp1_user.c | 1 -
samples/bpf/xdp_adjust_tail_user.c | 1 -
samples/bpf/xdp_monitor_user.c | 1 -
samples/bpf/xdp_redirect_cpu_user.c | 1 -
samples/bpf/xdp_redirect_map_multi_user.c | 1 -
samples/bpf/xdp_redirect_user.c | 1 -
samples/bpf/xdp_router_ipv4_user.c | 1 -
samples/bpf/xdp_rxq_info_user.c | 1 -
samples/bpf/xdp_sample_pkts_user.c | 1 -
samples/bpf/xdp_sample_user.c | 1 -
samples/bpf/xdp_tx_iptunnel_user.c | 1 -
samples/bpf/xdpsock_user.c | 9 ++----
samples/bpf/xsk_fwd.c | 7 ++---
tools/bpf/bpftool/common.c | 8 ------
tools/bpf/bpftool/feature.c | 2 --
tools/bpf/bpftool/main.c | 6 ++--
tools/bpf/bpftool/main.h | 2 --
tools/bpf/bpftool/map.c | 2 --
tools/bpf/bpftool/pids.c | 1 -
tools/bpf/bpftool/prog.c | 3 --
tools/bpf/bpftool/struct_ops.c | 2 --
tools/bpf/runqslower/runqslower.c | 18 ++----------
tools/testing/selftests/bpf/bench.c | 1 -
tools/testing/selftests/bpf/bpf_rlimit.h | 28 -------------------
.../selftests/bpf/flow_dissector_load.c | 6 ++--
.../selftests/bpf/get_cgroup_id_user.c | 4 ++-
tools/testing/selftests/bpf/prog_tests/btf.c | 1 -
.../selftests/bpf/test_cgroup_storage.c | 4 ++-
tools/testing/selftests/bpf/test_dev_cgroup.c | 4 ++-
tools/testing/selftests/bpf/test_lpm_map.c | 4 ++-
tools/testing/selftests/bpf/test_lru_map.c | 4 ++-
.../selftests/bpf/test_skb_cgroup_id_user.c | 4 ++-
tools/testing/selftests/bpf/test_sock.c | 4 ++-
tools/testing/selftests/bpf/test_sock_addr.c | 4 ++-
tools/testing/selftests/bpf/test_sockmap.c | 5 ++--
tools/testing/selftests/bpf/test_sysctl.c | 4 ++-
tools/testing/selftests/bpf/test_tag.c | 4 ++-
.../bpf/test_tcp_check_syncookie_user.c | 4 ++-
.../selftests/bpf/test_tcpnotify_user.c | 1 -
.../testing/selftests/bpf/test_verifier_log.c | 5 ++--
.../selftests/bpf/xdp_redirect_multi.c | 1 -
tools/testing/selftests/bpf/xdping.c | 8 ++----
tools/testing/selftests/bpf/xdpxceiver.c | 6 ++--
61 files changed, 57 insertions(+), 142 deletions(-)
delete mode 100644 tools/testing/selftests/bpf/bpf_rlimit.h
--
2.17.1