Hello,
This patchset is our exploration of how to support 1G pages in guest_memfd, and
how the pages will be used in Confidential VMs.
The patchset covers:
+ How to get 1G pages
+ Allowing mmap() of guest_memfd to userspace so that both private and shared
memory can use the same physical pages
+ Splitting and reconstructing pages to support conversions and mmap()
+ How the VM, userspace and guest_memfd interact to support conversions
+ Selftests to test all the above
+ Selftests also demonstrate the conversion flow between VM, userspace and
guest_memfd.
Why 1G pages in guest_memfd?
To bring guest_memfd to performance and memory-savings parity with VMs that are
backed by HugeTLBfs:
+ Performance is improved with 1G pages through more TLB hits and faster page
walks on TLB misses.
+ Memory savings from 1G pages come from HugeTLB Vmemmap Optimization (HVO).
Options for 1G page support:
1. HugeTLB
2. Contiguous Memory Allocator (CMA)
3. Other suggestions are welcome!
Comparison between options:
1. HugeTLB
+ Refactor HugeTLB to separate allocator from the rest of HugeTLB
+ Pro: Graceful transition for VMs backed by HugeTLB to guest_memfd
+ Near term: Allows co-tenancy of HugeTLB and guest_memfd backed VMs
+ Pro: Can provide iterative steps toward new future allocator
+ Unexplored: Managing userspace-visible changes
+ e.g. HugeTLB's free_hugepages will decrease if HugeTLB is used,
but not when the future allocator is used
2. CMA
+ Port some HugeTLB features to be applied on CMA
+ Pro: Clean slate
What would refactoring HugeTLB involve?
(Some refactoring was done in this RFC, more can be done.)
1. Broadly involves separating the HugeTLB allocator from the rest of HugeTLB
+ Brings more modularity to HugeTLB
+ No functionality change intended
+ Likely step towards HugeTLB's integration into core-mm
2. guest_memfd will use just the allocator component of HugeTLB, not including
the complex parts of HugeTLB like
+ Userspace reservations (resv_map)
+ Shared PMD mappings
+ Special page walkers
What features would need to be ported to CMA?
+ Improved allocation guarantees
+ Per NUMA node pool of huge pages
+ Subpools per guest_memfd
+ Memory savings
+ Something like HugeTLB Vmemmap Optimization
+ Configuration/reporting features
+ Configuration of the number of pages available (overall and per NUMA
node) at and after host boot
+ Reporting of memory usage/availability statistics at runtime
HugeTLB was picked as the source of 1G pages for this RFC because it allows a
graceful transition and retains the memory savings from HVO.
To illustrate: if a host machine uses HugeTLBfs to back VMs and a confidential
VM were to be scheduled on that host, some HugeTLBfs pages would have to be
given up and returned to CMA so that guest_memfd pages could be rebuilt from
that memory. Removing HVO and reapplying it on the new guest_memfd memory
requires memory to be reserved on the host to facilitate these transitions,
and this not only slows down memory allocation but also trims the benefits of
HVO.
Improving how guest_memfd uses the allocator in a future revision of this RFC:
To provide an easier transition away from HugeTLB, guest_memfd's use of HugeTLB
should be limited to the following allocator functions (a rough C sketch of
this interface is shown after the list):
+ reserve(node, page_size, num_pages) => opaque handle
+ Used when a guest_memfd inode is created to reserve memory from backend
allocator
+ allocate(handle, mempolicy, page_size) => folio
+ To allocate a folio from guest_memfd's reservation
+ split(handle, folio, target_page_size) => void
+ To take a huge folio, split it into smaller folios, and restore them to
the filemap
+ reconstruct(handle, first_folio, nr_pages) => void
+ To reconstruct a huge folio out of nr_pages pages starting at
first_folio
+ free(handle, folio) => void
+ To return folio to guest_memfd's reservation
+ error(handle, folio) => void
+ To handle memory errors
+ unreserve(handle) => void
+ To return guest_memfd's reservation to allocator backend
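As mentioned above, a rough C sketch of what this interface could look like
(the handle type and function names here are placeholders for illustration,
not code from this series):

/* Illustrative-only sketch of the allocator interface described above. */
struct gmem_alloc_handle;	/* opaque per-inode reservation handle */

/* Reserve num_pages pages of page_size bytes on the given node when a
 * guest_memfd inode is created. */
struct gmem_alloc_handle *gmem_reserve(int node, size_t page_size,
				       size_t num_pages);

/* Allocate one folio from the inode's reservation. */
struct folio *gmem_allocate(struct gmem_alloc_handle *handle,
			    struct mempolicy *mpol, size_t page_size);

/* Split a huge folio into target_page_size folios and restore them to
 * the filemap. */
void gmem_split(struct gmem_alloc_handle *handle, struct folio *folio,
		size_t target_page_size);

/* Reconstruct a huge folio out of nr_pages pages starting at first_folio. */
void gmem_reconstruct(struct gmem_alloc_handle *handle,
		      struct folio *first_folio, size_t nr_pages);

/* Return a folio to the inode's reservation. */
void gmem_free(struct gmem_alloc_handle *handle, struct folio *folio);

/* Handle a memory error on a folio. */
void gmem_error(struct gmem_alloc_handle *handle, struct folio *folio);

/* Return the whole reservation to the backend allocator. */
void gmem_unreserve(struct gmem_alloc_handle *handle);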
Userspace should only provide a page size when creating a guest_memfd and should
not have to specify HugeTLB.
Overview of patches:
+ Patches 01-12
+ Many small changes to HugeTLB, mostly to separate HugeTLBfs concepts from
HugeTLB, and to expose HugeTLB functions.
+ Patches 13-16
+ Letting guest_memfd use HugeTLB
+ Creation of each guest_memfd reserves pages from HugeTLB's global hstate
and puts them into the guest_memfd inode's subpool
+ Each folio allocation takes a page from the guest_memfd inode's subpool
+ Patches 17-21
+ Selftests for new HugeTLB features in guest_memfd
+ Patches 22-24
+ More small changes on the HugeTLB side to expose functions needed by
guest_memfd
+ Patch 25:
+ Uses the newly available functions from patches 22-24 to split HugeTLB
pages. In this patch, HugeTLB folios are always split to 4K before any
usage, private or shared.
+ Patches 26-28
+ Allow mmap() in guest_memfd and faulting in shared pages
+ Patch 29
+ Enables conversion between private/shared pages
+ Patch 30
+ Required to zero folios after conversions to avoid leaking initialized
kernel memory
+ Patches 31-38
+ Add selftests for mapping pages to userspace and guest/host memory
sharing, and update the conversion tests
+ Patch 33 illustrates the conversion flow between VM/userspace/guest_memfd
+ Patch 39
+ Dynamically split and reconstruct HugeTLB pages instead of always
splitting before use. All earlier selftests are expected to still pass.
TODOs:
+ Add logic to wait for safe_refcount [1]
+ Look into lazy splitting/reconstruction of pages
+ Currently, when KVM_SET_MEMORY_ATTRIBUTES is invoked, not only are the
mem_attr_array and faultability updated, but the pages in the requested
range are also split/reconstructed as necessary. We want to look into
delaying splitting/reconstruction until fault time.
+ Solve race between folios being faulted in and being truncated
+ When running private_mem_conversions_test with more than 1 vCPU, a folio
getting truncated may get faulted in by another process, causing elevated
mapcounts when the folio is freed (VM_BUG_ON_FOLIO).
+ Add intermediate splits (1G should first split to 2M and not split directly to
4K)
+ Use guest_memfd's lock instead of hugetlb_lock
+ Use a multi-index xarray, or replace the xarray with some other data
structure, for the faultability flag
+ Refactor HugeTLB further to present a generic allocator interface
Please let us know your thoughts on:
+ HugeTLB as the choice of transitional allocator backend
+ Refactoring HugeTLB to provide generic allocator interface
+ Shared/private conversion flow
+ Requiring the user to request that the kernel unmap pages from userspace
using madvise(MADV_DONTNEED) (see the sketch after this list)
+ Failing conversion on elevated mapcounts/pincounts/refcounts
+ The process of splitting/reconstructing pages
+ Anything else!
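As referenced above, a minimal userspace-side sketch of the shared-to-private
conversion flow (assuming the KVM_SET_MEMORY_ATTRIBUTES uAPI used by this
series; the helper name and error handling are illustrative only):

#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

/* Hypothetical flow: the caller drops its mappings with
 * madvise(MADV_DONTNEED) before asking KVM to flip the attribute, so that
 * mapcounts/refcounts are not elevated when guest_memfd splits or
 * reconstructs folios. */
static int convert_to_private(int vm_fd, void *uaddr, uint64_t gpa,
			      uint64_t size)
{
	struct kvm_memory_attributes attrs = {
		.address = gpa,
		.size = size,
		.attributes = KVM_MEMORY_ATTRIBUTE_PRIVATE,
	};

	/* Ask the kernel to unmap the shared pages from userspace first. */
	if (madvise(uaddr, size, MADV_DONTNEED))
		return -1;

	/* Then flip the range to private; this is where the pages get
	 * split/reconstructed as necessary. */
	return ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &attrs);
}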
[1] https://lore.kernel.org/all/20240829-guest-memfd-lib-v2-0-b9afc1ff3656@quic…
Ackerley Tng (37):
mm: hugetlb: Simplify logic in dequeue_hugetlb_folio_vma()
mm: hugetlb: Refactor vma_has_reserves() to should_use_hstate_resv()
mm: hugetlb: Remove unnecessary check for avoid_reserve
mm: mempolicy: Refactor out policy_node_nodemask()
mm: hugetlb: Refactor alloc_buddy_hugetlb_folio_with_mpol() to
interpret mempolicy instead of vma
mm: hugetlb: Refactor dequeue_hugetlb_folio_vma() to use mpol
mm: hugetlb: Refactor out hugetlb_alloc_folio
mm: truncate: Expose preparation steps for truncate_inode_pages_final
mm: hugetlb: Expose hugetlb_subpool_{get,put}_pages()
mm: hugetlb: Add option to create new subpool without using surplus
mm: hugetlb: Expose hugetlb_acct_memory()
mm: hugetlb: Move and expose hugetlb_zero_partial_page()
KVM: guest_memfd: Make guest mem use guest mem inodes instead of
anonymous inodes
KVM: guest_memfd: hugetlb: initialization and cleanup
KVM: guest_memfd: hugetlb: allocate and truncate from hugetlb
KVM: guest_memfd: Add page alignment check for hugetlb guest_memfd
KVM: selftests: Add basic selftests for hugetlb-backed guest_memfd
KVM: selftests: Support various types of backing sources for private
memory
KVM: selftests: Update test for various private memory backing source
types
KVM: selftests: Add private_mem_conversions_test.sh
KVM: selftests: Test that guest_memfd usage is reported via hugetlb
mm: hugetlb: Expose vmemmap optimization functions
mm: hugetlb: Expose HugeTLB functions for promoting/demoting pages
mm: hugetlb: Add functions to add/move/remove from hugetlb lists
KVM: guest_memfd: Track faultability within a struct kvm_gmem_private
KVM: guest_memfd: Allow mmapping guest_memfd files
KVM: guest_memfd: Use vm_type to determine default faultability
KVM: Handle conversions in the SET_MEMORY_ATTRIBUTES ioctl
KVM: guest_memfd: Handle folio preparation for guest_memfd mmap
KVM: selftests: Allow vm_set_memory_attributes to be used without
asserting return value of 0
KVM: selftests: Test using guest_memfd memory from userspace
KVM: selftests: Test guest_memfd memory sharing between guest and host
KVM: selftests: Add notes in private_mem_kvm_exits_test for mmap-able
guest_memfd
KVM: selftests: Test that pinned pages block KVM from setting memory
attributes to PRIVATE
KVM: selftests: Refactor vm_mem_add to be more flexible
KVM: selftests: Add helper to perform madvise by memslots
KVM: selftests: Update private_mem_conversions_test for mmap()able
guest_memfd
Vishal Annapurve (2):
KVM: guest_memfd: Split HugeTLB pages for guest_memfd use
KVM: guest_memfd: Dynamically split/reconstruct HugeTLB page
fs/hugetlbfs/inode.c | 35 +-
include/linux/hugetlb.h | 54 +-
include/linux/kvm_host.h | 1 +
include/linux/mempolicy.h | 2 +
include/linux/mm.h | 1 +
include/uapi/linux/kvm.h | 26 +
include/uapi/linux/magic.h | 1 +
mm/hugetlb.c | 346 ++--
mm/hugetlb_vmemmap.h | 11 -
mm/mempolicy.c | 36 +-
mm/truncate.c | 26 +-
tools/include/linux/kernel.h | 4 +-
tools/testing/selftests/kvm/Makefile | 3 +
.../kvm/guest_memfd_hugetlb_reporting_test.c | 222 +++
.../selftests/kvm/guest_memfd_pin_test.c | 104 ++
.../selftests/kvm/guest_memfd_sharing_test.c | 160 ++
.../testing/selftests/kvm/guest_memfd_test.c | 238 ++-
.../testing/selftests/kvm/include/kvm_util.h | 45 +-
.../testing/selftests/kvm/include/test_util.h | 18 +
tools/testing/selftests/kvm/lib/kvm_util.c | 443 +++--
tools/testing/selftests/kvm/lib/test_util.c | 99 ++
.../kvm/x86_64/private_mem_conversions_test.c | 158 +-
.../x86_64/private_mem_conversions_test.sh | 91 +
.../kvm/x86_64/private_mem_kvm_exits_test.c | 11 +-
virt/kvm/guest_memfd.c | 1563 ++++++++++++++++-
virt/kvm/kvm_main.c | 17 +
virt/kvm/kvm_mm.h | 16 +
27 files changed, 3288 insertions(+), 443 deletions(-)
create mode 100644 tools/testing/selftests/kvm/guest_memfd_hugetlb_reporting_test.c
create mode 100644 tools/testing/selftests/kvm/guest_memfd_pin_test.c
create mode 100644 tools/testing/selftests/kvm/guest_memfd_sharing_test.c
create mode 100755 tools/testing/selftests/kvm/x86_64/private_mem_conversions_test.sh
--
2.46.0.598.g6f2099f65c-goog
Greetings:
Welcome to v4.
This series fixes netdevsim to correctly set the NAPI ID on the skb.
This is helpful for writing tests around features that use
SO_INCOMING_NAPI_ID.
In addition to the netdevsim fix in patch 1, patches 2 & 3 do some selftest
refactoring and add a test for NAPI IDs. The test itself (patch 3)
introduces a C helper (sketched below) because apparently Python doesn't
have socket.SO_INCOMING_NAPI_ID.
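For reference, the core of such a helper is just a getsockopt() call (a
minimal sketch, not the actual helper added in patch 3):

#include <stdio.h>
#include <sys/socket.h>

#ifndef SO_INCOMING_NAPI_ID
#define SO_INCOMING_NAPI_ID 56	/* value from asm-generic/socket.h */
#endif

/* Print the NAPI ID of the queue that last delivered traffic to 'fd'.
 * The socket must have received packets for the ID to be non-zero. */
int print_napi_id(int fd)
{
	unsigned int napi_id = 0;
	socklen_t len = sizeof(napi_id);

	if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len))
		return -1;

	printf("%u\n", napi_id);
	return 0;
}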
Thanks,
Joe
v4:
- Updated the macro guard in patch 2
- Removed the remote deploy from patch 3
v3: https://lore.kernel.org/netdev/20250418013719.12094-1-jdamato@fastly.com/
- Dropped patch 3 from v2 as it is no longer necessary.
- Patch 3 from this series (which was patch 4 in the v2)
- Sorted .gitignore alphabetically
- added cfg.remote_deploy so the test supports real remote machines
- Dropped the NetNSEnter as it is unnecessary
- Fixed a string interpolation issue that Paolo hit with his Python
version
v2: https://lore.kernel.org/netdev/20250417013301.39228-1-jdamato@fastly.com/
- No longer an RFC
- Minor whitespace change in patch 1 (no functional change).
- Patches 2-4 new in v2
rfcv1: https://lore.kernel.org/netdev/20250329000030.39543-1-jdamato@fastly.com/
Joe Damato (3):
netdevsim: Mark NAPI ID on skb in nsim_rcv
selftests: drv-net: Factor out ksft C helpers
selftests: drv-net: Test that NAPI ID is non-zero
drivers/net/netdevsim/netdev.c | 2 +
.../testing/selftests/drivers/net/.gitignore | 1 +
tools/testing/selftests/drivers/net/Makefile | 6 +-
tools/testing/selftests/drivers/net/ksft.h | 56 +++++++++++++
.../testing/selftests/drivers/net/napi_id.py | 23 +++++
.../selftests/drivers/net/napi_id_helper.c | 83 +++++++++++++++++++
.../selftests/drivers/net/xdp_helper.c | 49 +----------
7 files changed, 172 insertions(+), 48 deletions(-)
create mode 100644 tools/testing/selftests/drivers/net/ksft.h
create mode 100755 tools/testing/selftests/drivers/net/napi_id.py
create mode 100644 tools/testing/selftests/drivers/net/napi_id_helper.c
base-commit: cd7276ecac9c64c80433fbcff2e35aceaea6f477
--
2.43.0
Drivers that are told to allocate RX buffers from pools of DMA memory
should have enough memory in the pool to satisfy projected allocation
requests (a function of ring size, MTU & other parameters). If there's
not enough memory, RX ring refill might fail later at inconvenient times
(e.g. during NAPI poll).
This commit adds a check at dmabuf pool init time that compares the
amount of memory in the underlying chunk pool (configured by the user
space application providing dmabuf memory) with the desired pool size
(previously set by the driver) and fails with an error message if there
isn't enough chunk memory.
Fixes: 0f9214046893 ("memory-provider: dmabuf devmem memory provider")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
---
net/core/devmem.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 6e27a47d0493..651cd55ebb28 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -299,6 +299,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
int mp_dmabuf_devmem_init(struct page_pool *pool)
{
struct net_devmem_dmabuf_binding *binding = pool->mp_priv;
+ size_t size;
if (!binding)
return -EINVAL;
@@ -312,6 +313,16 @@ int mp_dmabuf_devmem_init(struct page_pool *pool)
if (pool->p.order != 0)
return -E2BIG;
+ /* Validate that the underlying dmabuf has enough memory to satisfy
+ * requested pool size.
+ */
+ size = gen_pool_size(binding->chunk_pool) >> PAGE_SHIFT;
+ if (size < pool->p.pool_size) {
+ pr_warn("%s: Insufficient dmabuf memory (%zu pages) to satisfy pool_size (%u pages)\n",
+ __func__, size, pool->p.pool_size);
+ return -ENOMEM;
+ }
+
net_devmem_dmabuf_binding_get(binding);
return 0;
}
--
2.45.0
This patch series introduces the Hornet LSM. The goal of Hornet is to
provide a signature verification mechanism for eBPF programs.
eBPF has similar requirements to modules when it comes to loading: finding
symbol addresses, fixing up ELF relocations, handling struct field offsets
via CO-RE (Compile Once, Run Everywhere), and some other miscellaneous
bookkeeping. During eBPF program compilation, pseudo-values get written to
the immediate operands of instructions. During loading, those pseudo-values
get rewritten with concrete addresses or data applicable to the currently
running system, e.g., a kallsyms address or an fd for a map. This needs to
happen before the instructions for a bpf program are loaded into the kernel
via the bpf() syscall. Unlike with modules, an in-kernel loader
unfortunately doesn't exist. Typically, the instruction rewriting is done
dynamically in userspace via libbpf. Since the relocations and instruction
modifications happen in userspace, and their values may change depending on
the running system, this breaks known signature verification mechanisms.
Light skeleton programs were introduced to support early loading of eBPF
programs along with user-mode drivers. They use a separate eBPF program
that can load a target eBPF program and perform all necessary relocations
in-kernel without needing a working userspace. Light skeletons were
mentioned as a possible path forward for signature verification [1][2].
Hornet takes a simple approach to light-skeleton-based eBPF signature
verification: it uses a PKCS#7 signature of a data buffer containing the
raw instructions of an eBPF program, followed by the initial values of any
maps used by the program. A utility script is provided to parse and extract
the contents of autogenerated header files created via bpftool. That
payload can then be signed and appended to the light skeleton executable.
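As a conceptual illustration of that payload layout (this is only a sketch;
the actual format is produced by the sign-ebpf tool and extraction script in
this series):

#include <linux/bpf.h>
#include <stdlib.h>
#include <string.h>

/* Build the buffer that gets PKCS#7-signed: the raw program instructions
 * followed by the initial map values.  Illustrative only. */
void *build_signing_payload(const struct bpf_insn *insns, size_t insn_cnt,
			    const void *map_data, size_t map_len,
			    size_t *payload_len)
{
	size_t insn_len = insn_cnt * sizeof(*insns);
	unsigned char *buf = malloc(insn_len + map_len);

	if (!buf)
		return NULL;

	memcpy(buf, insns, insn_len);
	memcpy(buf + insn_len, map_data, map_len);
	*payload_len = insn_len + map_len;

	/* The caller signs this buffer and appends the signature to the
	 * light skeleton executable. */
	return buf;
}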
Maps are frozen to prevent TOCTOU bugs where a sufficiently privileged
user could rewrite map data between the calls to BPF_PROG_LOAD and
BPF_PROG_RUN. Additionally, both sparse-array-based and
fd_array_cnt-based map fd arrays are supported for signature
verification.
References:
[1] https://lore.kernel.org/bpf/20220209054315.73833-1-alexei.starovoitov@gmail…
[2] https://lore.kernel.org/bpf/CAADnVQ+wPK1KKZhCgb-Nnf0Xfjk8M1UpX5fnXC=cBzdEYb…
Change list:
- v1 -> v2
- Jargon clarification, maintainer entry and a few cosmetic fixes
Revisions:
- v1
https://lore.kernel.org/bpf/20250321164537.16719-1-bboscaccy@linux.microsof…
Blaise Boscaccy (4):
security: Hornet LSM
hornet: Introduce sign-ebpf
hornet: Add a light-skeleton data extractor script
selftests/hornet: Add a selftest for the Hornet LSM
Documentation/admin-guide/LSM/Hornet.rst | 53 +++
Documentation/admin-guide/LSM/index.rst | 1 +
MAINTAINERS | 9 +
crypto/asymmetric_keys/pkcs7_verify.c | 10 +
include/linux/kernel_read_file.h | 1 +
include/linux/verification.h | 1 +
include/uapi/linux/lsm.h | 1 +
scripts/Makefile | 1 +
scripts/hornet/Makefile | 5 +
scripts/hornet/extract-skel.sh | 29 ++
scripts/hornet/sign-ebpf.c | 411 +++++++++++++++++++
security/Kconfig | 3 +-
security/Makefile | 1 +
security/hornet/Kconfig | 11 +
security/hornet/Makefile | 4 +
security/hornet/hornet_lsm.c | 239 +++++++++++
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/hornet/Makefile | 51 +++
tools/testing/selftests/hornet/loader.c | 21 +
tools/testing/selftests/hornet/trivial.bpf.c | 33 ++
20 files changed, 885 insertions(+), 1 deletion(-)
create mode 100644 Documentation/admin-guide/LSM/Hornet.rst
create mode 100644 scripts/hornet/Makefile
create mode 100755 scripts/hornet/extract-skel.sh
create mode 100644 scripts/hornet/sign-ebpf.c
create mode 100644 security/hornet/Kconfig
create mode 100644 security/hornet/Makefile
create mode 100644 security/hornet/hornet_lsm.c
create mode 100644 tools/testing/selftests/hornet/Makefile
create mode 100644 tools/testing/selftests/hornet/loader.c
create mode 100644 tools/testing/selftests/hornet/trivial.bpf.c
--
2.48.1
Hello,
this series is a revival of Xu Kuohai's work to enable a larger argument
count for BPF programs on ARM64 ([1]). His initial series received some
positive feedback, but lacked handling of some specific cases around
argument alignment (see the AAPCS64 C.14 rule in section 6.8.2, [2]). There
has been another attempt from Puranjay Mohan, which was unfortunately
missing the same thing ([3]). Since some time has passed between those
series and this new one, I chose to send it as a new series rather than a
new revision of the existing series.
To support the increased argument count and arguments larger than the
register size (e.g. structures), the trampoline does the following (a
hypothetical example follows this list):
- for bpf programs: arguments are retrieved from both registers and the
function stack, and pushed onto the trampoline stack as an array of u64
to generate the programs' context, which is then passed by pointer to
the bpf programs
- when the trampoline is in charge of calling the original function: it
restores the registers' content and generates a new stack layout for
the additional arguments that do not fit in registers.
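The hypothetical example mentioned above, illustrating the register/stack
split and the AAPCS64 C.14 alignment rule (not taken from the series; plain C,
compilable on arm64):

#include <stdio.h>

/* 16-byte-aligned composite type. */
struct aligned16 {
	__int128 v;
};

/* On arm64, a..h fill x0-x7; i is passed on the caller's stack at sp+0.
 * j is a 16-byte-aligned composite, so per the AAPCS64 C.14 rule it cannot
 * sit right after i: it is placed at the next 16-byte boundary (sp+16),
 * leaving 8 bytes of padding that the trampoline must reproduce when
 * calling the original function. */
long many_args(long a, long b, long c, long d, long e, long f, long g,
	       long h, long i, struct aligned16 j)
{
	return a + i + (long)j.v;
}

int main(void)
{
	struct aligned16 j = { .v = 30 };

	printf("%ld\n", many_args(1, 2, 3, 4, 5, 6, 7, 8, 20, j));
	return 0;
}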
This new attempt is based on Xu's series and aims to handle the missing
alignment concern raised in the review discussions. The main novelties are
around argument alignment:
- the first commit exposes some new info in the BTF function model
passed to the JIT compiler to allow it to deduce the needed alignment
when configuring the trampoline stack
- the second commit is taken from Xu's series, and received the
following modifications:
- calc_aux_args computes an expected alignment for each argument
- calc_aux_args computes two different stack space sizes: the one
needed to store the bpf programs' context, and the one for the original
function's stacked arguments (which need alignment). Those stack sizes
are expressed in bytes instead of "slots"
- when saving/restoring arguments for the bpf program or for the original
function, make sure to align the load/store accordingly, when
relevant
- a few typo fixes and some rewording, raised by the review on the
original series
- the last commit introduces some explicit tests that ensure that the
needed alignment is enforced by the trampoline
I marked the series as RFC because it appears that the new tests trigger
some failures in CI on x86 and s390, despite the series not touching any
code related to those architectures. Some very early investigation/gdb
debugging on the x86 side seems to hint that it could be related to the
same missing alignment too (based on section 3.2.3 in [4], and so the
x86 trampoline would need the same alignment handling ?). For s390 it
looks less clear, as all values captured from the bpf test program are
set to 0 in the CI output, and I don't have the proper setup yet to
check the low level details. I am tempted to isolate those new tests
(which were actually useful to spot real issues while tuning the ARM64
trampoline) and add them to the relevant DENYLIST files for x86/s390,
but I guess this is not the right direction, so I would gladly take a
second opinion on this.
[1] https://lore.kernel.org/all/20230917150752.69612-1-xukuohai@huaweicloud.com…
[2] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#id82
[3] https://lore.kernel.org/bpf/20240705125336.46820-1-puranjay@kernel.org/
[4] https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
Alexis Lothoré (eBPF Foundation) (3):
bpf: add struct largest member size in func model
bpf/selftests: add tests to validate proper arguments alignment on ARM64
bpf/selftests: enable tracing tests for ARM64
Xu Kuohai (1):
bpf, arm64: Support up to 12 function arguments
arch/arm64/net/bpf_jit_comp.c | 235 ++++++++++++++++-----
include/linux/bpf.h | 1 +
kernel/bpf/btf.c | 25 +++
tools/testing/selftests/bpf/DENYLIST.aarch64 | 3 -
.../selftests/bpf/prog_tests/tracing_struct.c | 23 ++
tools/testing/selftests/bpf/progs/tracing_struct.c | 10 +-
.../selftests/bpf/progs/tracing_struct_many_args.c | 67 ++++++
.../testing/selftests/bpf/test_kmods/bpf_testmod.c | 50 +++++
8 files changed, 357 insertions(+), 57 deletions(-)
---
base-commit: 91e7eb701b4bc389e7ddfd80ef6e82d1a6d2d368
change-id: 20250220-many_args_arm64-8bd3747e6948
Best regards,
--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
This is the initial import of a CAN selftest from can-tests[1] into the
tree. For now, it is just a single test, but once the structure is agreed
on, we intend to import more tests from can-tests and add additional test
cases.
The goal of moving the CAN selftests into the tree is to align the tests
more closely with the kernel, improve testing of CAN in general, and to
simplify running the tests automatically in the various kernel CI
systems.
I have cc'ed netdev and its reviewers and maintainers to make sure they
are okay with the location of the tests and the changes to the paths in
MAINTAINERS. The changes should be merged through linux-can-next and
subsequent changes will not go to netdev anymore.
[1]: https://github.com/linux-can/can-tests
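For readers unfamiliar with the kselftest harness conversion done in patches
2 and 3 below, a minimal fixture-based sketch (hypothetical, not the actual
test_raw_filter code; it assumes a vcan0 interface exists and that the usual
selftest include path for kselftest_harness.h applies):

#include <unistd.h>
#include <net/if.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>

#include "../../kselftest_harness.h"	/* include path is an assumption */

FIXTURE(can_raw) {
	int sock;
};

FIXTURE_SETUP(can_raw)
{
	struct sockaddr_can addr = { .can_family = AF_CAN };

	self->sock = socket(PF_CAN, SOCK_RAW, CAN_RAW);
	ASSERT_GE(self->sock, 0);

	addr.can_ifindex = if_nametoindex("vcan0");
	ASSERT_NE(addr.can_ifindex, 0);
	ASSERT_EQ(bind(self->sock, (struct sockaddr *)&addr, sizeof(addr)), 0);
}

FIXTURE_TEARDOWN(can_raw)
{
	close(self->sock);
}

TEST_F(can_raw, set_single_filter)
{
	struct can_filter filter = {
		.can_id = 0x123,
		.can_mask = CAN_SFF_MASK,
	};

	ASSERT_EQ(setsockopt(self->sock, SOL_CAN_RAW, CAN_RAW_FILTER,
			     &filter, sizeof(filter)), 0);
}

TEST_HARNESS_MAIN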
Felix Maurer (4):
selftests: can: Import tst-filter from can-tests
selftests: can: use kselftest harness in test_raw_filter
selftests: can: Use fixtures in test_raw_filter
selftests: can: Document test_raw_filter test cases
MAINTAINERS | 2 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/net/can/.gitignore | 2 +
tools/testing/selftests/net/can/Makefile | 11 +
.../selftests/net/can/test_raw_filter.c | 338 ++++++++++++++++++
.../selftests/net/can/test_raw_filter.sh | 37 ++
6 files changed, 391 insertions(+)
create mode 100644 tools/testing/selftests/net/can/.gitignore
create mode 100644 tools/testing/selftests/net/can/Makefile
create mode 100644 tools/testing/selftests/net/can/test_raw_filter.c
create mode 100755 tools/testing/selftests/net/can/test_raw_filter.sh
--
2.49.0
The following set of commands:
ip link add br0 type bridge vlan_filtering 1 # vlan_default_pvid 1 is implicit
ip link set swp0 master br0
bridge vlan add dev swp0 vid 1
should result in the dropping of untagged and 802.1p-tagged traffic, but
we see that such traffic continues to be accepted. Had we deleted VID 1
instead, the aforementioned dropping would have worked.
This happens because the ANA_PORT_DROP_CFG update logic doesn't run:
ocelot_vlan_add() only calls ocelot_port_set_pvid() if the new VLAN has
the BRIDGE_VLAN_INFO_PVID flag.
Similar to other drivers like mt7530_port_vlan_add() which handle this
case correctly, we need to test whether the VLAN we're changing used to
have the BRIDGE_VLAN_INFO_PVID flag but has now lost it. That amounts to a
PVID deletion and should be treated as such.
Regarding blame attribution: this has never worked properly since the
introduction of bridge VLAN filtering in commit 7142529f1688 ("net:
mscc: ocelot: add VLAN filtering"). However, there was a significant
paradigm shift which aligned the ANA_PORT_DROP_CFG register with the
PVID concept rather than with the native VLAN concept, and that change
wasn't targeted for 'stable'. Realistically, that is as far back as this
fix needs to be propagated.
Fixes: be0576fed6d3 ("net: mscc: ocelot: move the logic to drop 802.1p traffic to the pvid deletion")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/net/ethernet/mscc/ocelot.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index ef93df520887..08bee56aea35 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -830,6 +830,7 @@ EXPORT_SYMBOL(ocelot_vlan_prepare);
int ocelot_vlan_add(struct ocelot *ocelot, int port, u16 vid, bool pvid,
bool untagged)
{
+ struct ocelot_port *ocelot_port = ocelot->ports[port];
int err;
/* Ignore VID 0 added to our RX filter by the 8021q module, since
@@ -849,6 +850,11 @@ int ocelot_vlan_add(struct ocelot *ocelot, int port, u16 vid, bool pvid,
ocelot_bridge_vlan_find(ocelot, vid));
if (err)
return err;
+ } else if (ocelot_port->pvid_vlan &&
+ ocelot_bridge_vlan_find(ocelot, vid) == ocelot_port->pvid_vlan) {
+ err = ocelot_port_set_pvid(ocelot, port, NULL);
+ if (err)
+ return err;
}
/* Untagged egress vlan clasification */
--
2.43.0
The Automated Testing Summit (ATS) 2025 will be held as a co-located event at the Open Source Summit North America, and we’re now accepting talk proposals!
https://events.linuxfoundation.org/open-source-summit-north-america/feature…
📅 Date: June 26, 2025
📍 Location: Denver, CO, USA
Hosted by KernelCI, ATS is a technical summit focused on the challenges of testing and quality assurance in the Linux ecosystem — especially in upstream kernel development, embedded systems, cloud environments, and CI integration.
This is a great opportunity to share your work on:
* Kernel and userspace test frameworks
* Lab infrastructure and automation
* CI/CD pipelines for Linux
* Fuzzing, performance testing, and debugging tools
* Sharing and standardizing test results across systems
Whether you’re working on kernel testing, running tests on hardware labs, developing QA tools, or building infrastructure that scales across projects, ATS is the place to collaborate and move the ecosystem forward.
Submit your talk by May 18, 2025:
👉 Call for Proposals (CFP): https://sessionize.com/atsna2025
We hope to see you in Denver!
— The KernelCI Team