Fix up small broken window panes in DAMON selftests and kunit tests.
The first four patches clean up the DAMON debugfs interface selftests output
by fixing a segmentation fault in a test program (patch 1), removing an
unnecessary debugging message (patch 2), and hiding error messages from
expected failures (patches 3 and 4).
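For reference, the segmentation fault fixed by patch 1 comes down to read()ing
the DAMON debugfs 'DEPRECATED' file with a huge count while backing it with a
too-small buffer; the fix is to make the buffer at least as large as the count
passed to read().  A minimal illustration of that pattern (not the actual test
code; the debugfs path and the 25 MiB count here are assumptions for the
example only):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
        /* Arbitrary, intentionally large read request. */
        const size_t huge_count = 25 * 1024 * 1024;
        /* The buffer must be able to hold whatever the kernel writes back. */
        char *buf = malloc(huge_count);
        int fd;

        if (!buf)
                return 1;
        fd = open("/sys/kernel/debug/damon/DEPRECATED", O_RDONLY);
        if (fd < 0) {
                perror("open");
                free(buf);
                return 1;
        }
        if (read(fd, buf, huge_count) < 0)
                perror("read");
        close(fd);
        free(buf);
        return 0;
}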
The following two patches fix copy-paste mistakes: one in a DAMON Kconfig
help message that was copied from that of the debugfs kunit test (patch 5),
and one in a comment in the debugfs kunit test code (patch 6).
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Andrew Paniakin (1):
selftests/damon/huge_count_read_write: provide sufficiently large
buffer for DEPRECATED file read
SeongJae Park (5):
selftests/damon/huge_count_read_write: remove unnecessary debugging
message
selftests/damon/_debugfs_common: hide expected error message from
test_write_result()
selftests/damon/debugfs_duplicate_context_creation: hide errors from
expected file write failures
mm/damon/Kconfig: update DBGFS_KUNIT prompt copy for SYSFS_KUNIT
mm/damon/tests/dbgfs-kunit: fix the header double inclusion guarding
ifdef comment
mm/damon/Kconfig | 2 +-
mm/damon/tests/dbgfs-kunit.h | 2 +-
tools/testing/selftests/damon/_debugfs_common.sh | 7 ++++++-
.../selftests/damon/debugfs_duplicate_context_creation.sh | 2 +-
tools/testing/selftests/damon/huge_count_read_write.c | 4 +---
5 files changed, 10 insertions(+), 7 deletions(-)
base-commit: 13583c750117b4e10cdaf5578dcc7723b305ce4e
--
2.39.5
Two small fixes related to the MPTCP packet scheduler:
- Patch 1: add missing rcu_read_(un)lock(); see the sketch after this list.
  A fix for >= 6.6.
- Patch 2: remove an unneeded lock when listing packet schedulers. A fix
  for >= 6.10.
And one modification in the MPTCP selftests:
- Patch 3: a small addition to the MPTCP selftests to cover more code.
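For context, the pattern patch 1 restores is the usual RCU read-side one: an
RCU-protected pointer must only be dereferenced between rcu_read_lock() and
rcu_read_unlock().  A generic, kernel-style sketch of that pattern
(illustrative only; my_sched/cur_sched/use_sched are placeholder names, not
the actual net/mptcp code):

#include <linux/rcupdate.h>
#include <linux/printk.h>

struct my_sched {
        const char *name;
};

static struct my_sched __rcu *cur_sched;        /* placeholder RCU pointer */

static void use_sched(void)
{
        struct my_sched *sched;

        rcu_read_lock();
        sched = rcu_dereference(cur_sched);     /* only safe under the lock */
        if (sched)
                pr_debug("using scheduler %s\n", sched->name);
        rcu_read_unlock();
}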
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Matthieu Baerts (NGI0) (3):
mptcp: init: protect sched with rcu_read_lock
mptcp: remove unneeded lock when listing scheds
selftests: mptcp: list sysctl data
net/mptcp/protocol.c | 2 ++
net/mptcp/sched.c | 2 --
tools/testing/selftests/net/mptcp/mptcp_connect.sh | 9 +++++++++
3 files changed, 11 insertions(+), 2 deletions(-)
---
base-commit: 3b05b9c36ddd01338e1352588f2ec1ea23f97d43
change-id: 20241021-net-mptcp-sched-lock-10dfc75d1e00
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Userland library functions such as allocators and threading implementations
often require regions of memory to act as 'guard pages' - mappings which,
when accessed, result in a fatal signal being sent to the accessing
process.
The current means by which these are implemented is via a PROT_NONE mmap()
mapping, which provides the required semantics but incurs the overhead of
a VMA for each such region.
With a great many processes and threads, this can rapidly add up and incur
a significant memory penalty. It also has the added problem of preventing
merges that might otherwise be permitted.
This series takes a different approach - an idea suggested by Vlastimil
Babka (and before him David Hildenbrand and Jann Horn - perhaps more - the
provenance becomes a little tricky to ascertain after this - please forgive
any omissions!) - rather than locating the guard pages at the VMA layer,
instead placing them in page tables mapping the required ranges.
Early testing of the prototype version of this code suggests a 5 times
speed up in memory mapping invocations (in conjunction with use of
process_madvise()) and a 13% reduction in VMAs on an entirely idle Android
system with unoptimised code.
We expect that, with optimisation and on a loaded system with a larger
number of guard pages, these gains could increase significantly, but in any
case the numbers are encouraging.
This way, rather than having separate VMAs specifying which parts of a
range are guard pages, instead we have a VMA spanning the entire range of
memory a user is permitted to access and including ranges which are to be
'guarded'.
After mapping this, a user can specify which parts of the range should
result in a fatal signal when accessed.
By restricting the ability to specify guard pages to memory mapped by
existing VMAs, we can rely on the mappings being torn down when the
mappings are ultimately unmapped and everything works simply as if the
memory were not faulted in, from the point of view of the containing VMAs.
This mechanism in effect poisons memory ranges similar to hardware memory
poisoning, only it is an entirely software-controlled form of poisoning.
The mechanism is implemented via madvise() behaviour - MADV_GUARD_INSTALL,
which installs page table-level guard page markers, and MADV_GUARD_REMOVE,
which clears them.
Guard markers can be installed across multiple VMAs and any existing
mappings will be cleared, that is zapped, before installing the guard page
markers in the page tables.
There is no concept of 'nested' guard markers; multiple attempts to install
guard markers in a range will, after the first attempt, have no effect.
Importantly, removing guard markers over a range that contains both guard
markers and ordinary backed memory has no effect on anything but the guard
markers (including leaving huge pages un-split), so a user can safely
remove guard markers over a range of memory leaving the rest intact.
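To make the intended usage concrete, below is a rough userspace sketch
(assuming the MADV_GUARD_INSTALL/MADV_GUARD_REMOVE definitions from this
series are available via updated headers; otherwise they would need to be
defined locally to match the series):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        const size_t page = (size_t)sysconf(_SC_PAGESIZE);
        /* One VMA spanning the whole range, including the to-be-guarded page. */
        char *buf = mmap(NULL, 4 * page, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (buf == MAP_FAILED)
                return 1;

        /* Make the first page a guard page: accessing it now raises SIGSEGV. */
        if (madvise(buf, page, MADV_GUARD_INSTALL))
                perror("MADV_GUARD_INSTALL");

        /* The rest of the range remains ordinary anonymous memory. */
        memset(buf + page, 0, 3 * page);

        /* buf[0] = 0; here would deliver a fatal SIGSEGV to this process. */

        /* Remove the marker; the page then behaves as if never faulted in. */
        if (madvise(buf, page, MADV_GUARD_REMOVE))
                perror("MADV_GUARD_REMOVE");

        munmap(buf, 4 * page);
        return 0;
}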
The actual mechanism by which the page table entries are specified makes
use of existing logic - PTE markers, which are used for the userfaultfd
UFFDIO_POISON mechanism.
Unfortunately PTE_MARKER_POISONED is not suited for the guard page
mechanism as it results in VM_FAULT_HWPOISON semantics in the fault
handler, so we add our own specific PTE_MARKER_GUARD and adapt existing
logic to handle it.
We also extend the generic page walk mechanism to allow for installation of
PTEs (carefully restricted to memory management logic only to prevent
unwanted abuse).
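Conceptually, the page walk extension is a new callback in struct mm_walk_ops
that is invoked for each PTE slot in the walked range, which the madvise code
then uses to drop in a guard marker.  A very rough sketch of the shape of such
a callback (the exact signature and helpers are those of this series and may
differ; treat this as illustration, not the actual patch):

#include <linux/pagewalk.h>
#include <linux/swapops.h>

/* Illustrative only - see the actual patches for the real interface. */
static int guard_install_pte(unsigned long addr, unsigned long next,
                             pte_t *ptep, struct mm_walk *walk)
{
        unsigned long *nr_installed = walk->private;

        /* Install a guard marker; any later access faults fatally. */
        set_pte_at(walk->mm, addr, ptep, make_pte_marker(PTE_MARKER_GUARD));
        (*nr_installed)++;

        return 0;
}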
We ensure that zapping performed by MADV_DONTNEED and MADV_FREE does not
remove guard markers, nor does forking (except when VM_WIPEONFORK is
specified for a VMA, which implies a total removal of memory
characteristics).
It's important to note that the guard page implementation is emphatically
NOT a security feature, so a user can remove the markers if they wish. We
simply implement it in such a way as to provide the least surprising
behaviour.
An extensive set of self-tests is provided which ensures behaviour is as
expected and additionally self-documents the expected behaviour of guard
ranges.
Suggested-by: Vlastimil Babka <vbabka(a)suse.cz>
Suggested-by: Jann Horn <jannh(a)google.com>
Suggested-by: David Hildenbrand <david(a)redhat.com>
v3
* Cleaned up mm/pagewalk.c logic a bit to make things clearer, as suggested
by Vlastimil.
* Explicitly avoid splitting THP on PTE installation, as suggested by
Vlastimil. Note this has no impact on the guard pages logic, which has
page table entry handlers at PUD, PMD and PTE level.
* Added WARN_ON_ONCE() to mm/hugetlb.c path where we don't expect a guard
marker, as suggested by Vlastimil.
* Reverted change to is_poisoned_swp_entry() to exclude guard pages which
has the effect of MADV_FREE _not_ clearing guard pages. After discussion
with Vlastimil, it became apparent that the ability to 'cancel' the
freeing operation by writing to the mapping after having issued an
MADV_FREE would mean that we would risk unexpected behaviour should the
guard pages be removed, so we now do not remove markers here at all.
* Added comment to PTE_MARKER_GUARD to highlight that memory tagged with
the marker behaves as if it were a region mapped PROT_NONE, as
highlighted by David.
* Rename poison -> install, unpoison -> remove (i.e. MADV_GUARD_INSTALL /
MADV_GUARD_REMOVE over MADV_GUARD_POISON / MADV_GUARD_UNPOISON) at the
request of David and John who both find the poison analogy
confusing/overloaded.
* After a lot of discussion, replace the looping behaviour should page
faults race with guard page installation with a modest reattempt followed
by returning -ERESTARTNOINTR to have the operation abort and re-enter,
relieving lock contention and avoiding the possibility of allowing a
malicious sandboxed process to impact the mmap lock or stall the overall
process more than necessary, as suggested by Jann and Vlastimil having
raised the issue.
* Adjusted the page table walker so a populated huge PUD or PMD is
correctly treated as being populated, necessitating a zap. In v2 we
incorrectly skipped over these, which would cause the logic to wrongly
proceed as if nothing were populated and the install succeeded.
Instead, explicitly check to see if a huge page is present - if so, do not
split but rather abort the operation and let zap take care of things.
* Updated the guard remove logic to not unnecessarily split huge pages
either.
* Added a debug check to assert that the number of installed PTEs matches
expectation, accounting for any existing guard pages.
* Adapted vector_madvise() used by the process_madvise() system call to
handle -ERESTARTNOINTR correctly.
v2
* The macros in kselftest_harness.h seem to be broken - __EXPECT() is
terminated by '} while (0); OPTIONAL_HANDLER(_assert)' meaning it is not
safe in single line if / else or for / while blocks, however working
around this results in checkpatch producing invalid warnings, as reported
by Shuah.
* Fixing these macros is out of scope for this series, so compromise and
instead rewrite test blocks so as to use multiple lines by separating out
a decl in most cases. This has the side effect of, for the most part,
making things more readable.
* Heavily document the use of the volatile keyword - we can't avoid
checkpatch complaining about this, so we explain it, as reported by
Shuah.
* Updated commit message to highlight that we skip tests we lack
permissions for, as reported by Shuah.
* Replaced a perror() with ksft_exit_fail_perror(), as reported by Shuah.
* Added user friendly messages to cases where tests are skipped due to lack
of permissions, as reported by Shuah.
* Update the tool header to include the new MADV_GUARD_POISON/UNPOISON
defines and directly include asm-generic/mman.h to get the
platform-neutral versions to ensure we import them.
* Finally fixed Vlastimil's email address in Suggested-by tags from suze to
suse, as reported by Vlastimil.
* Added linux-api to cc list, as reported by Vlastimil.
https://lore.kernel.org/all/cover.1729440856.git.lorenzo.stoakes@oracle.com/
v1
* Un-RFC'd as appears no major objections to approach but rather debate on
implementation.
* Fixed issue with arches which need mmu_context.h and tlbflush.h header
imports in pagewalker logic to be able to use update_mmu_cache(), as
reported by the kernel test bot.
* Added comments in page walker logic to clarify who can use
ops->install_pte and why as well as adding a check_ops_valid() helper
function, as suggested by Christoph.
* Pass false in full parameter in pte_clear_not_present_full() as suggested
by Jann.
* Stopped erroneously requiring a write lock for the poison operation as
suggested by Jann and Suren.
* Moved anon_vma_prepare() to the start of madvise_guard_poison() to be
consistent with how this is used elsewhere in the kernel as suggested by
Jann.
* Avoid returning -EAGAIN if we are raced on page faults, just keep looping
and duck out if a fatal signal is pending or a conditional reschedule is
needed, as suggested by Jann.
* Avoid needlessly splitting huge PUDs and PMDs by specifying
ACTION_CONTINUE, as suggested by Jann.
https://lore.kernel.org/all/cover.1729196871.git.lorenzo.stoakes@oracle.com/
RFC
https://lore.kernel.org/all/cover.1727440966.git.lorenzo.stoakes@oracle.com/
Lorenzo Stoakes (5):
mm: pagewalk: add the ability to install PTEs
mm: add PTE_MARKER_GUARD PTE marker
mm: madvise: implement lightweight guard page mechanism
tools: testing: update tools UAPI header for mman-common.h
selftests/mm: add self tests for guard page feature
arch/alpha/include/uapi/asm/mman.h | 3 +
arch/mips/include/uapi/asm/mman.h | 3 +
arch/parisc/include/uapi/asm/mman.h | 3 +
arch/xtensa/include/uapi/asm/mman.h | 3 +
include/linux/mm_inline.h | 2 +-
include/linux/pagewalk.h | 18 +-
include/linux/swapops.h | 24 +-
include/uapi/asm-generic/mman-common.h | 3 +
mm/hugetlb.c | 4 +
mm/internal.h | 12 +
mm/madvise.c | 225 ++++
mm/memory.c | 18 +-
mm/mprotect.c | 6 +-
mm/mseal.c | 1 +
mm/pagewalk.c | 227 +++-
tools/include/uapi/asm-generic/mman-common.h | 3 +
tools/testing/selftests/mm/.gitignore | 1 +
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/guard-pages.c | 1239 ++++++++++++++++++
19 files changed, 1720 insertions(+), 76 deletions(-)
create mode 100644 tools/testing/selftests/mm/guard-pages.c
--
2.47.0
The following arm64 kselftests failed on
- Qemu-arm64
- FVP
running the Linux next-20241025 kernel.
First seen on next-20241025
Good: next-20241024
BAD: next-20241025
kselftest-arm64, FVP
* arm64_check_buffer_fill
* arm64_check_mmap_options
* arm64_check_child_memory
Has anyone seen these failures?
Reported-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Test log:
----------
# selftests: arm64: check_buffer_fill
# 1..20
# ok 1 Check buffer correctness by byte with sync err mode and mmap memory
# ok 2 Check buffer correctness by byte with async err mode and mmap memory
# ok 3 Check buffer correctness by byte with sync err mode and mmap/mprotect memory
# ok 4 Check buffer correctness by byte with async err mode and mmap/mprotect memory
# ok 5 Check buffer write underflow by byte with sync mode and mmap memory
# ok 6 Check buffer write underflow by byte with async mode and mmap memory
# ok 7 Check buffer write underflow by byte with tag check fault ignore and mmap memory
# ok 8 Check buffer write underflow by byte with sync mode and mmap memory
# ok 9 Check buffer write underflow by byte with async mode and mmap memory
# ok 10 Check buffer write underflow by byte with tag check fault ignore and mmap memory
# ok 11 Check buffer write overflow by byte with sync mode and mmap memory
# ok 12 Check buffer write overflow by byte with async mode and mmap memory
# ok 13 Check buffer write overflow by byte with tag fault ignore mode and mmap memory
# ok 14 Check buffer write correctness by block with sync mode and mmap memory
# ok 15 Check buffer write correctness by block with async mode and mmap memory
# ok 16 Check buffer write correctness by block with tag fault ignore and mmap memory
# # FAIL: mmap allocation
# # FAIL: memory allocation
# not ok 17 Check initial tags with private mapping, sync error mode and mmap memory
# ok 18 Check initial tags with private mapping, sync error mode and mmap/mprotect memory
# # FAIL: mmap allocation
# # FAIL: memory allocation
# not ok 19 Check initial tags with shared mapping, sync error mode and mmap memory
# ok 20 Check initial tags with shared mapping, sync error mode and mmap/mprotect memory
# # Totals: pass:18 fail:2 xfail:0 xpass:0 skip:0 error:0
not ok 21 selftests: arm64: check_buffer_fill # exit=1
# timeout set to 45
# selftests: arm64: check_child_memory
# 1..12
# ok 1 Check child anonymous memory with private mapping, precise mode and mmap memory
# ok 2 Check child anonymous memory with shared mapping, precise mode and mmap memory
# ok 3 Check child anonymous memory with private mapping, imprecise mode and mmap memory
# ok 4 Check child anonymous memory with shared mapping, imprecise mode and mmap memory
# ok 5 Check child anonymous memory with private mapping, precise mode and mmap/mprotect memory
# ok 6 Check child anonymous memory with shared mapping, precise mode and mmap/mprotect memory
# # FAIL: mmap allocation
# # FAIL: memory allocation
# not ok 7 Check child file memory with private mapping, precise mode and mmap memory
# # FAIL: mmap allocation
# # FAIL: memory allocation
# not ok 8 Check child file memory with shared mapping, precise mode and mmap memory
# ok 9 Check child file memory with private mapping, imprecise mode and mmap memory
# ok 10 Check child file memory with shared mapping, imprecise mode and mmap memory
# ok 11 Check child file memory with private mapping, precise mode and mmap/mprotect memory
# ok 12 Check child file memory with shared mapping, precise mode and mmap/mprotect memory
# # Totals: pass:10 fail:2 xfail:0 xpass:0 skip:0 error:0
not ok 22 selftests: arm64: check_child_memory # exit=1
Boot log links:
--------
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te…
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te…
Test results history:
----------
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te…
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20241028/te…
metadata:
----
git describe: next-20241028
git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
git sha: dec9255a128e19c5fcc3bdb18175d78094cc624d
kernel config:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2o3tMqzOtHXYQjlvfR5t…
build url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2o3tMqzOtHXYQjlvfR5t…
toolchain: gcc-13
Steps to reproduce:
---------
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2o3tON5kNi…
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2o3tON5kNi…
--
Linaro LKFT
https://lkft.linaro.org
As part 2 of the VIOMMU infrastructure, this series introduces a VIRQ
object by repurposing the existing FAULT object, which already provides a
nice notification pathway to user space. So, the first thing to do is to
rework the FAULT object.
Mimicking the HWPT structures, add a common EVENT structure to support its
derivatives: EVENT_IOPF (the prior FAULT object) and EVENT_VIRQ (the new
one). IOMMUFD_CMD_VIRQ_ALLOC is introduced to allocate an EVENT_VIRQ for a
VIOMMU. One VIOMMU can have multiple VIRQs of different types, but cannot
have multiple VIRQs of the same type.
Drivers might need the VIOMMU's vdev_id list, or the exact vdev_id link of
a passthrough device, to forward IRQs/events via the VIOMMU framework.
Thus, extend the set/unset_vdev_id ioctls down to the driver via the VIOMMU
ops. This allows drivers to take control of a vdev_id's lifecycle.
The forwarding part is fairly simple, but might need to replace a physical
device ID with a virtual device ID. So, a few helpers are added for drivers
to use; a rough sketch follows below.
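As an illustration of how a driver might use those helpers, here is a rough
sketch of forwarding an event to user space as a virtual IRQ.  The helper
name iommufd_viommu_report_irq comes from the patches below, but its
signature, the MY_VIRQ_TYPE value and the struct my_virq_data payload are
placeholders for this sketch rather than the real code:

#include <linux/iommufd.h>

#define MY_VIRQ_TYPE    1       /* placeholder for a real uAPI type value */

struct my_virq_data {
        __u32 vdev_id;          /* virtual device ID seen by the guest */
        __u32 event_code;       /* device-specific event encoding */
};

static void my_driver_forward_irq(struct iommufd_viommu *viommu,
                                  u32 vdev_id, u32 event_code)
{
        struct my_virq_data data = {
                .vdev_id = vdev_id,     /* physical ID already translated */
                .event_code = event_code,
        };

        /* Queue the event on the VIOMMU's VIRQ so user space can read it. */
        iommufd_viommu_report_irq(viommu, MY_VIRQ_TYPE, &data, sizeof(data));
}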
As usual, this series comes with selftest coverage for the new VIRQ, and
with a real-world use case in the ARM SMMUv3 driver.
This must be based on the VIOMMU Part-1 series. It's on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_virq-v1
Pairing QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_virq-v1
Thanks!
Nicolin
Nicolin Chen (10):
iommufd: Rename IOMMUFD_OBJ_FAULT to IOMMUFD_OBJ_EVENT_IOPF
iommufd: Rename fault.c to event.c
iommufd: Add IOMMUFD_OBJ_EVENT_VIRQ and IOMMUFD_CMD_VIRQ_ALLOC
iommufd/viommu: Allow drivers to control vdev_id lifecycle
iommufd/viommu: Add iommufd_vdev_id_to_dev helper
iommufd/viommu: Add iommufd_viommu_report_irq helper
iommufd/selftest: Implement mock_viommu_set/unset_vdev_id
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VIRQ for VIRQ coverage
iommufd/selftest: Add EVENT_VIRQ test coverage
iommu/arm-smmu-v3: Report virtual IRQ for device in user space
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 109 +++-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/iommufd/device.c | 2 +
drivers/iommu/iommufd/event.c | 613 ++++++++++++++++++
drivers/iommu/iommufd/fault.c | 443 -------------
drivers/iommu/iommufd/hw_pagetable.c | 12 +-
drivers/iommu/iommufd/iommufd_private.h | 147 ++++-
drivers/iommu/iommufd/iommufd_test.h | 10 +
drivers/iommu/iommufd/main.c | 13 +-
drivers/iommu/iommufd/selftest.c | 66 ++
drivers/iommu/iommufd/viommu.c | 25 +-
drivers/iommu/iommufd/viommu_api.c | 54 ++
include/linux/iommufd.h | 28 +
include/uapi/linux/iommufd.h | 46 ++
tools/testing/selftests/iommu/iommufd.c | 11 +
tools/testing/selftests/iommu/iommufd_utils.h | 64 ++
17 files changed, 1130 insertions(+), 517 deletions(-)
create mode 100644 drivers/iommu/iommufd/event.c
delete mode 100644 drivers/iommu/iommufd/fault.c
--
2.43.0
This series introduces a new vIOMMU infrastructure and related ioctls.
IOMMUFD has been using the HWPT infrastructure for all cases, including
nested IO page table support. Yet, there are limitations for an HWPT-based
structure to support some advanced HW-accelerated features, such as CMDQV
on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even for a multi-IOMMU
environment, it is not straightforward for nested HWPTs to share the same
parent HWPT (stage-2 IO pagetable), with the HWPT infrastructure alone: a
parent HWPT typically holds one stage-2 IO pagetable and tags it with only
one ID in the cache entries. When sharing one large stage-2 IO pagetable
across physical IOMMU instances, that one ID may not always be available
across all the IOMMU instances. In other words, it's ideal for SW to have
a different container for the stage-2 IO pagetable so it can hold another
ID that's available.
For this "different container", add vIOMMU, an additional layer to hold
extra virtualization information:
 _________________________________________________________________________
|                          iommufd (with vIOMMU)                          |
|                                                                         |
|                               [5]                                       |
|                          _____________                                  |
|                         |             |                                 |
|          |--------------|   vIOMMU    |                                 |
|          |              |             |                                 |
|          |              |             |                                 |
|          |    [1]       |             |          [4]            [2]     |
|          |   ______     |             |     _____________    ________   |
|          |  |      |    |     [3]     |    |             |  |        |  |
|          |  | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
|          |  |______|    |_____________|    |_____________|   |________| |
|          |     |              |                  |               |      |
|__________|_____|______________|__________________|_______________|_____|
           |     |              |                  |               |
   ________v___  |          _____v______       _____v______       __v___
  |   struct   | |    PFN  |  (paging)  |     |  (nested)  |     |struct|
  |iommu_device| |-------->|iommu_domain|<----|iommu_domain|<----|device|
  |____________|    storage|____________|     |____________|     |______|
The vIOMMU object should be seen as a slice of a physical IOMMU instance
that is passed to or shared with a VM. That slice can include HW/SW resources:
- Security namespace for guest owned ID, e.g. guest-controlled cache tags
- Access to a sharable nesting parent pagetable across physical IOMMUs
- Virtualization of various platform IDs, e.g. RIDs and others
- Delivery of paravirtualized invalidation
- Direct assigned invalidation queues
- Direct assigned interrupts
- Non-affiliated event reporting
On a multi-IOMMU system, one vIOMMU object must be instantiated per
physical IOMMU that is passed to (via devices) a guest VM, while being able
to hold the shareable parent HWPT. Each vIOMMU then just needs to allocate
its own individual ID to tag its own cache:
                         ----------------------------
     ----------------    |         |  paging_hwpt0  |
     | hwpt_nested0 |--->| viommu0 ------------------
     ----------------    |         |      IDx       |
                         ----------------------------
                         ----------------------------
     ----------------    |         |  paging_hwpt0  |
     | hwpt_nested1 |--->| viommu1 ------------------
     ----------------    |         |      IDy       |
                         ----------------------------
As an initial part 1, add the IOMMUFD_CMD_VIOMMU_ALLOC ioctl for allocation
only, and implement it in the arm-smmu-v3 driver as a real-world use case.
More vIOMMU-based structs and ioctls will be introduced in follow-up series
to support vDEVICE, vIRQ (vEVENT) and vQUEUE objects. Note that the vIOMMU
object is repurposed from an earlier RFC, just for reference:
https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/
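For a rough idea of the user space flow, the sketch below allocates a vIOMMU
on top of a nesting-parent HWPT.  The struct layout and field names are an
approximation of the uAPI this series adds (see include/uapi/linux/iommufd.h
in the patches for the authoritative definition):

#include <linux/iommufd.h>
#include <sys/ioctl.h>

/* Approximate usage sketch - check the series' uAPI header for exact fields. */
static int viommu_alloc(int iommufd, __u32 dev_id, __u32 parent_hwpt_id,
                        __u32 *viommu_id)
{
        struct iommu_viommu_alloc cmd = {
                .size = sizeof(cmd),
                .type = IOMMU_VIOMMU_TYPE_ARM_SMMUV3,   /* per-driver type */
                .dev_id = dev_id,               /* device attached to this IOMMU */
                .hwpt_id = parent_hwpt_id,      /* shareable nesting parent HWPT */
        };

        if (ioctl(iommufd, IOMMU_VIOMMU_ALLOC, &cmd))
                return -1;
        *viommu_id = cmd.out_viommu_id;
        return 0;
}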
This series is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p1-v4
(pairing QEMU branch for testing will be provided with the part-2 series)
Changelog
v4
* Added "Reviewed-by" from Jason
* Dropped IOMMU_VIOMMU_TYPE_DEFAULT support
* Dropped iommufd_object_alloc_elm renamings
* Renamed iommufd's viommu_api.c to driver.c
* Reworked iommufd_viommu_alloc helper
* Added a separate iommufd_hwpt_nested_alloc_for_viommu function for
hwpt_nested allocations on a vIOMMU, and added comparison between
viommu->iommu_dev->ops and dev_iommu_ops(idev->dev)
* Replaced s2_parent with vsmmu in arm_smmu_nested_domain
* Replaced domain_alloc_user in iommu_ops with domain_alloc_nested in
viommu_ops
* Replaced wait_queue_head_t with a completion, to delay the unplug of
mock_iommu_dev
* Corrected documentation graph that was missing struct iommu_device
* Added an iommufd_verify_unfinalized_object helper to verify driver-
allocated vIOMMU/vDEVICE objects
* Added missing test cases for TEST_LENGTH and fail_nth
v3
https://lore.kernel.org/all/cover.1728491453.git.nicolinc@nvidia.com/
* Rebased on top of Jason's nesting v3 series
https://lore.kernel.org/all/0-v3-e2e16cd7467f+2a6a1-smmuv3_nesting_jgg@nvid…
* Split the series into smaller parts
* Added Jason's Reviewed-by
* Added back viommu->iommu_dev
* Added support for driver-allocated vIOMMU v.s. core-allocated
* Dropped arm_smmu_cache_invalidate_user
* Added an iommufd_test_wait_for_users() in selftest
* Reworked test code to make viommu an individual FIXTURE
* Added missing TEST_LENGTH case for the new ioctl command
v2
https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
* Limited vdev_id to one per idev
* Added a rw_sem to protect the vdev_id list
* Reworked driver-level APIs with proper lockings
* Added a new viommu_api file for IOMMUFD_DRIVER config
* Dropped useless iommu_dev point from the viommu structure
* Added missing index numbers to new types in the uAPI header
* Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one
* Reworked mock_viommu_cache_invalidate() using the new iommu helper
* Reordered details of set/unset_vdev_id handlers for proper lockings
v1
https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (11):
iommufd: Move struct iommufd_object to public iommufd header
iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
iommufd: Add iommufd_verify_unfinalized_object
iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
iommufd: Add domain_alloc_nested op to iommufd_viommu_ops
iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
iommufd/selftest: Add refcount to mock_iommu_device
iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST
iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
Documentation: userspace-api: iommufd: Update vIOMMU
iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
drivers/iommu/iommufd/Makefile | 5 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 26 +++---
drivers/iommu/iommufd/iommufd_private.h | 36 ++------
drivers/iommu/iommufd/iommufd_test.h | 2 +
include/linux/iommu.h | 14 +++
include/linux/iommufd.h | 89 +++++++++++++++++++
include/uapi/linux/iommufd.h | 56 ++++++++++--
tools/testing/selftests/iommu/iommufd_utils.h | 28 ++++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 79 ++++++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 +-
drivers/iommu/iommufd/driver.c | 38 ++++++++
drivers/iommu/iommufd/hw_pagetable.c | 69 +++++++++++++-
drivers/iommu/iommufd/main.c | 58 ++++++------
drivers/iommu/iommufd/selftest.c | 73 +++++++++++++--
drivers/iommu/iommufd/viommu.c | 85 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd.c | 78 ++++++++++++++++
.../selftests/iommu/iommufd_fail_nth.c | 11 +++
Documentation/userspace-api/iommufd.rst | 69 +++++++++++++-
18 files changed, 701 insertions(+), 124 deletions(-)
create mode 100644 drivers/iommu/iommufd/driver.c
create mode 100644 drivers/iommu/iommufd/viommu.c
--
2.43.0