QUIC requires end-to-end encryption of the data. The application usually
prepares the data in clear text, encrypts it and calls send(), which
implies multiple copies of the data before the packets hit the networking
stack. Similar to kTLS, kernel offload of the QUIC cryptography reduces
memory pressure by reducing the number of copies.
The scope of the kernel support is limited to symmetric cryptography,
leaving the handshake to the user-space library. For QUIC in particular,
the application packets that require symmetric cryptography are the 1-RTT
packets with short headers. The kernel encrypts the application packets
on transmit and decrypts them on receive. This series implements Tx only,
because in QUIC server applications Tx outweighs Rx by orders of
magnitude.
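
For illustration, the intended split could look roughly like the sketch
below: the user-space library runs the handshake, derives the 1-RTT Tx
secrets and hands them to the kernel over the UDP socket. The option
names, option numbers and the key-material struct layout are placeholders
for this example, not the exact definitions from include/uapi/linux/quic.h
in this series.

#include <stdint.h>
#include <string.h>
#include <err.h>
#include <sys/socket.h>
#include <netinet/udp.h>

/* Placeholder stand-ins for the real uapi/linux/quic.h definitions. */
#ifndef UDP_ULP
#define UDP_ULP              200   /* assumed option number */
#endif
#define UDP_QUIC_ADD_TX      201   /* assumed option number */

struct quic_tx_crypto_info {       /* illustrative layout only */
	uint16_t cipher_type;      /* e.g. AES-128-GCM */
	uint8_t  key[16];
	uint8_t  iv[12];
};

static void quic_offload_attach(int fd, const uint8_t *key, const uint8_t *iv)
{
	struct quic_tx_crypto_info info = { .cipher_type = 1 };

	memcpy(info.key, key, sizeof(info.key));
	memcpy(info.iv, iv, sizeof(info.iv));

	/* Attach the QUIC ULP to the connected UDP socket ... */
	if (setsockopt(fd, SOL_UDP, UDP_ULP, "quic", sizeof("quic")))
		err(1, "attach QUIC ULP");
	/* ... and install the 1-RTT Tx key material derived in user space. */
	if (setsockopt(fd, SOL_UDP, UDP_QUIC_ADD_TX, &info, sizeof(info)))
		err(1, "install 1-RTT Tx keys");

	/* From here on, short-header 1-RTT packets written with send()
	 * are encrypted by the kernel on transmit.
	 */
}
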
Supporting the combination of QUIC and GSO requires the application to
place the data correctly and the kernel to slice it correctly. The
encryption process appends a cipher-dependent number of bytes (the tag)
to the end of each message to authenticate it. The GSO value set by the
application should include this overhead; the offload then subtracts the
tag size when parsing the input on Tx, before chunking and encrypting it.
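
As a concrete illustration of that contract, assuming the usual 16-byte
AEAD tag and the existing UDP_SEGMENT socket option, the application-side
arithmetic would be roughly:

#include <stdint.h>
#include <sys/socket.h>
#include <netinet/udp.h>

#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103            /* from uapi/linux/udp.h */
#endif

#define QUIC_AEAD_TAG_LEN 16       /* AES-GCM / ChaCha20-Poly1305 tag */

/*
 * The GSO size advertised to the kernel covers the encrypted segment,
 * i.e. the plaintext chunk plus the tag; the Tx offload subtracts
 * QUIC_AEAD_TAG_LEN before slicing the plaintext into chunks and
 * encrypting each one.
 */
static int quic_set_gso(int fd, uint16_t plaintext_chunk)
{
	uint16_t gso_size = plaintext_chunk + QUIC_AEAD_TAG_LEN;

	return setsockopt(fd, SOL_UDP, UDP_SEGMENT,
			  &gso_size, sizeof(gso_size));
}
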
With kernel cryptography, the buffer copy operation is combined with the
encryption operation, reducing memory bandwidth usage by 5-8%. When
devices supporting QUIC encryption in hardware come to market, a further
7% of CPU utilization, which is spent today on crypto operations, can be
freed.
Adel Abouchaev (6):
Documentation on QUIC kernel Tx crypto.
Define QUIC specific constants, control and data plane structures
Add UDP ULP operations, initialization and handling prototype
functions.
Implement QUIC offload functions
Add flow counters and Tx processing error counter
Add self tests for ULP operations, flow setup and crypto tests
Documentation/networking/index.rst | 1 +
Documentation/networking/quic.rst | 215 ++++
include/net/inet_sock.h | 2 +
include/net/netns/mib.h | 3 +
include/net/quic.h | 63 +
include/net/snmp.h | 6 +
include/net/udp.h | 33 +
include/uapi/linux/quic.h | 68 ++
include/uapi/linux/snmp.h | 9 +
include/uapi/linux/udp.h | 4 +
net/Kconfig | 1 +
net/Makefile | 1 +
net/ipv4/Makefile | 3 +-
net/ipv4/udp.c | 15 +
net/ipv4/udp_ulp.c | 192 +++
net/quic/Kconfig | 16 +
net/quic/Makefile | 8 +
net/quic/quic_main.c | 1533 ++++++++++++++++++++++++
net/quic/quic_proc.c | 45 +
security/security.c | 1 +
tools/testing/selftests/net/.gitignore | 1 +
tools/testing/selftests/net/Makefile | 3 +-
tools/testing/selftests/net/quic.c | 1369 +++++++++++++++++++++
tools/testing/selftests/net/quic.sh | 46 +
24 files changed, 3636 insertions(+), 2 deletions(-)
create mode 100644 Documentation/networking/quic.rst
create mode 100644 include/net/quic.h
create mode 100644 include/uapi/linux/quic.h
create mode 100644 net/ipv4/udp_ulp.c
create mode 100644 net/quic/Kconfig
create mode 100644 net/quic/Makefile
create mode 100644 net/quic/quic_main.c
create mode 100644 net/quic/quic_proc.c
create mode 100644 tools/testing/selftests/net/quic.c
create mode 100755 tools/testing/selftests/net/quic.sh
--
2.30.2
From: Vijay Dhanraj <vijay.dhanraj(a)intel.com>
Add a new test case which is the same as augment_via_eaccept but adds a
larger number of EPC pages to stress test EAUG via EACCEPT.
Signed-off-by: Vijay Dhanraj <vijay.dhanraj(a)intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v8:
- Specify the dynamic heap size inside the test case.
v7:
- Now contains only the test case. Support for the dynamic heap is
  prepared in the preceding patches.
v6:
- Address Reinette's feedback:
https://lore.kernel.org/linux-sgx/Yw6%2FiTzSdSw%2FY%2FVO@kernel.org/
v5:
- Add the klog dump and sysctl option to the commit message.
v4:
- Explain expectations for dirty_page_list in the function header, instead
of an inline comment.
- Improve commit message to explain the conditions better.
- Return the number of pages left dirty to ksgxd() and print a warning
  after the 2nd call, if there are any.
v3:
- Remove WARN_ON().
- Tuned comments and the commit message a bit.
v2:
- Replaced WARN_ON() with optional pr_info() inside
__sgx_sanitize_pages().
- Rewrote the commit message.
- Added the fixes tag.
---
tools/testing/selftests/sgx/main.c | 112 ++++++++++++++++++++++++++++-
1 file changed, 111 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 78c3b913ce10..e596b45bc5f8 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -22,8 +22,10 @@
#include "main.h"
static const size_t ENCL_HEAP_SIZE_DEFAULT = PAGE_SIZE;
+static const unsigned long TIMEOUT_DEFAULT = 900;
static const uint64_t MAGIC = 0x1122334455667788ULL;
static const uint64_t MAGIC2 = 0x8877665544332211ULL;
+
vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave;
/*
@@ -387,7 +389,7 @@ TEST_F(enclave, unclobbered_vdso_oversubscribed)
EXPECT_EQ(self->run.user_data, 0);
}
-TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900)
+TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, TIMEOUT_DEFAULT)
{
struct sgx_enclave_remove_pages remove_ioc;
struct sgx_enclave_modify_types modt_ioc;
@@ -1245,6 +1247,114 @@ TEST_F(enclave, augment_via_eaccept)
munmap(addr, PAGE_SIZE);
}
+/*
+ * Test for the addition of a large number of pages to an initialized
+ * enclave via a preemptive run of EACCEPT on every page to be added.
+ */
+TEST_F_TIMEOUT(enclave, augment_via_eaccept_long, TIMEOUT_DEFAULT)
+{
+ /*
+ * The dynamic heap size was chosen based on a bug report:
+ * Message-ID:
+ * <DM8PR11MB55912A7F47A84EC9913A6352F6999(a)DM8PR11MB5591.namprd11.prod.outlook.com>
+ */
+ static const unsigned long DYNAMIC_HEAP_SIZE = 0x200000L * PAGE_SIZE;
+ struct encl_op_get_from_addr get_addr_op;
+ struct encl_op_put_to_addr put_addr_op;
+ struct encl_op_eaccept eaccept_op;
+ size_t total_size = 0;
+ unsigned long i;
+ void *addr;
+
+ if (!sgx2_supported())
+ SKIP(return, "SGX2 not supported");
+
+ ASSERT_TRUE(setup_test_encl_dynamic(ENCL_HEAP_SIZE_DEFAULT, DYNAMIC_HEAP_SIZE,
+ &self->encl, _metadata));
+
+ memset(&self->run, 0, sizeof(self->run));
+ self->run.tcs = self->encl.encl_base;
+
+ for (i = 0; i < self->encl.nr_segments; i++) {
+ struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+ total_size += seg->size;
+ }
+
+ /*
+ * mmap() every page at the end of the existing enclave to be
+ * used for EDMM.
+ */
+ addr = mmap((void *)self->encl.encl_base + total_size, DYNAMIC_HEAP_SIZE,
+ PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED | MAP_FIXED,
+ self->encl.fd, 0);
+ EXPECT_NE(addr, MAP_FAILED);
+
+ self->run.exception_vector = 0;
+ self->run.exception_error_code = 0;
+ self->run.exception_addr = 0;
+
+ /*
+ * Run EACCEPT on every page to trigger the #PF->EAUG->EACCEPT (again,
+ * without a #PF) flow. All of this should be transparent to userspace.
+ */
+ eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+ eaccept_op.ret = 0;
+ eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+ for (i = 0; i < DYNAMIC_HEAP_SIZE; i += PAGE_SIZE) {
+ eaccept_op.epc_addr = (uint64_t)(addr + i);
+
+ EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+ if (self->run.exception_vector == 14 &&
+ self->run.exception_error_code == 4 &&
+ self->run.exception_addr == self->encl.encl_base) {
+ munmap(addr, DYNAMIC_HEAP_SIZE);
+ SKIP(return, "Kernel does not support adding pages to initialized enclave");
+ }
+
+ EXPECT_EQ(self->run.exception_vector, 0);
+ EXPECT_EQ(self->run.exception_error_code, 0);
+ EXPECT_EQ(self->run.exception_addr, 0);
+ ASSERT_EQ(eaccept_op.ret, 0);
+ ASSERT_EQ(self->run.function, EEXIT);
+ }
+
+ /*
+ * The pool of pages was successfully added to the enclave. Perform a
+ * sanity check on the first page of the pool only, to ensure data can
+ * be written to and read from a dynamically added enclave page.
+ */
+ put_addr_op.value = MAGIC;
+ put_addr_op.addr = (unsigned long)addr;
+ put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+ EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+ EXPECT_EEXIT(&self->run);
+ EXPECT_EQ(self->run.exception_vector, 0);
+ EXPECT_EQ(self->run.exception_error_code, 0);
+ EXPECT_EQ(self->run.exception_addr, 0);
+
+ /*
+ * Read memory from the newly added page that was just written to,
+ * confirming that the data previously written (MAGIC) is present.
+ */
+ get_addr_op.value = 0;
+ get_addr_op.addr = (unsigned long)addr;
+ get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+ EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+ EXPECT_EQ(get_addr_op.value, MAGIC);
+ EXPECT_EEXIT(&self->run);
+ EXPECT_EQ(self->run.exception_vector, 0);
+ EXPECT_EQ(self->run.exception_error_code, 0);
+ EXPECT_EQ(self->run.exception_addr, 0);
+
+ munmap(addr, DYNAMIC_HEAP_SIZE);
+}
+
/*
* SGX2 page type modification test in two phases:
* Phase 1:
--
2.37.2
From: Roberto Sassu <roberto.sassu(a)huawei.com>
Add a missing fd modes check in map iterators; without it, eBPF programs
attached to an iterator can perform unauthorized map writes. Use this
patch set as an opportunity to start a discussion with the cgroup
developers about whether a similar security check is missing for their
iterator.
Also, extend libbpf with the _opts variants of bpf_*_get_fd_by_id(). Only
bpf_map_get_fd_by_id_opts() is strictly needed by this patch set, to
ensure that the creation of a map iterator fails with a read-only fd. The
other variants are added for symmetry, since they all share the same opts
structure, and to shrink the follow-up patch set fixing the map
permissions requested by bpftool, so that its remaining patches are only
about the latter.
Finally, extend the bpf_iter test with the read-only fd check, and test
each _opts variant of bpf_*_get_fd_by_id().
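
For context, caller-side usage would look roughly like the sketch below.
The opts struct name and its open_flags field follow the patch titles and
the existing BPF_F_RDONLY flag; treat the exact spelling as illustrative
rather than the final API.

#include <unistd.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>

/* Sketch only: request a read-only fd for an existing map and check
 * that it cannot be used to create a map_elem iterator (the iterator
 * needs write access for map updates done by the attached program).
 */
static int check_rdonly_map_iter(__u32 map_id, struct bpf_program *prog)
{
	LIBBPF_OPTS(bpf_get_fd_opts, fd_opts, .open_flags = BPF_F_RDONLY);
	LIBBPF_OPTS(bpf_iter_attach_opts, iter_opts);
	union bpf_iter_link_info linfo = {};
	struct bpf_link *link;
	int map_fd;

	map_fd = bpf_map_get_fd_by_id_opts(map_id, &fd_opts);
	if (map_fd < 0)
		return map_fd;

	linfo.map.map_fd = map_fd;
	iter_opts.link_info = &linfo;
	iter_opts.link_info_len = sizeof(linfo);

	/* With the fd modes check in place, attaching the iterator to a
	 * read-only map fd is expected to fail (-EACCES).
	 */
	link = bpf_program__attach_iter(prog, &iter_opts);
	if (!libbpf_get_error(link)) {
		bpf_link__destroy(link);
		close(map_fd);
		return -1;	/* unexpectedly succeeded */
	}

	close(map_fd);
	return 0;
}
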
Roberto Sassu (7):
bpf: Add missing fd modes check for map iterators
libbpf: Define bpf_get_fd_opts and introduce
bpf_map_get_fd_by_id_opts()
libbpf: Introduce bpf_prog_get_fd_by_id_opts()
libbpf: Introduce bpf_btf_get_fd_by_id_opts()
libbpf: Introduce bpf_link_get_fd_by_id_opts()
selftests/bpf: Ensure fd modes are checked for map iters and destroy
links
selftests/bpf: Add tests for _opts variants of libbpf
include/linux/bpf.h | 2 +-
kernel/bpf/inode.c | 2 +-
kernel/bpf/map_iter.c | 3 +-
kernel/bpf/syscall.c | 8 +-
net/core/bpf_sk_storage.c | 3 +-
net/core/sock_map.c | 3 +-
tools/lib/bpf/bpf.c | 47 +++++-
tools/lib/bpf/bpf.h | 16 ++
tools/lib/bpf/libbpf.map | 10 +-
tools/lib/bpf/libbpf_version.h | 2 +-
.../selftests/bpf/prog_tests/bpf_iter.c | 34 +++-
.../bpf/prog_tests/libbpf_get_fd_opts.c | 145 ++++++++++++++++++
.../bpf/progs/test_libbpf_get_fd_opts.c | 49 ++++++
13 files changed, 309 insertions(+), 15 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/libbpf_get_fd_opts.c
create mode 100644 tools/testing/selftests/bpf/progs/test_libbpf_get_fd_opts.c
--
2.25.1
Delete the redundant word 'in'.
Signed-off-by: wangjianli <wangjianli(a)cdjrlc.com>
---
tools/testing/selftests/cgroup/test_freezer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_freezer.c b/tools/testing/selftests/cgroup/test_freezer.c
index ff519029f6f4..b479434e87b7 100644
--- a/tools/testing/selftests/cgroup/test_freezer.c
+++ b/tools/testing/selftests/cgroup/test_freezer.c
@@ -740,7 +740,7 @@ static int test_cgfreezer_ptraced(const char *root)
/*
* cg_check_frozen(cgroup, true) will fail here,
- * because the task in in the TRACEd state.
+ * because the task in the TRACEd state.
*/
if (cg_freeze_wait(cgroup, false))
goto cleanup;
--
2.36.1