Hi Jason,
On Mon, May 05, 2025 at 11:18:33AM -0300, Jason Gunthorpe wrote:
> +static int pt_iommu_init_domain(struct pt_iommu *iommu_table,
> +				struct iommu_domain *domain)
> +{
> +	struct pt_common *common = common_from_iommu(iommu_table);
> +	struct pt_iommu_info info;
> +	struct pt_range range;
> +	NS(get_info)(iommu_table, &info);
> +	domain->type = __IOMMU_DOMAIN_PAGING;
> +	domain->pgsize_bitmap = info.pgsize_bitmap;
> +	if (pt_feature(common, PT_FEAT_DYNAMIC_TOP))
> +		range = _pt_top_range(common,
> +				      _pt_top_set(NULL, PT_MAX_TOP_LEVEL));
> +	else
> +		range = pt_top_range(common);
> +	/*
> +	 * A 64 bit high address space table on a 32 bit system cannot work.
> +	 */
> +	domain->geometry.aperture_start = (unsigned long)range.va;
> +	if ((pt_vaddr_t)domain->geometry.aperture_start != range.va ||
> +	    range.va > ULONG_MAX)
> +		return -EOVERFLOW;
> +	/*
> +	 * The aperture is limited to what the API can do after considering all
> +	 * the different types dma_addr_t/unsigned long/pt_vaddr_t that are used
> +	 * to store a VA. Set the aperture to something that is valid for all
> +	 * cases. Saturate instead of truncate the end if the types are smaller
> +	 * than the top range. aperture_end is a last.
> +	 */
> +	domain->geometry.aperture_end = (unsigned long)range.last_va;
I am experiencing a system hang on boot with the 5-level v2 page table mode: the NVMe boot drive fails to initialize. Below are the relevant dmesg logs, including some debug prints I added:
[ 6.386439] AMD-Vi v2 domain init
[ 6.390132] AMD-Vi v2 pt init
[ 6.390133] AMD-Vi aperture end last va ffffffffffffff
...
[ 10.315372] AMD-Vi gen pt MAP PAGES iova ffffffffffffe000 paddr 19351b000
...
[ 72.171930] nvme nvme0: I/O tag 0 (0000) QID 0 timeout, disable controller
[ 72.179618] nvme nvme1: I/O tag 24 (0018) QID 0 timeout, disable controller
[ 72.197176] nvme nvme0: Identify Controller failed (-4)
[ 72.203063] nvme nvme1: Identify Controller failed (-4)
[ 72.209237] nvme 0000:05:00.0: probe with driver nvme failed with error -5
[ 72.209336] nvme 0000:44:00.0: probe with driver nvme failed with error -5
...
Timed out waiting for the udev queue to be empty.
According to the dmesg logs above, the IOVA used for the v2 page table appears incorrect: the mapped IOVA (ffffffffffffe000) is not consistent with domain->geometry.aperture_end (printed as ffffffffffffff). I think this requires domain->geometry.force_aperture = true; to be set at the appropriate location, probably here.
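To illustrate what I mean, here is a minimal userspace sketch, not kernel code. struct geometry is a stand-in that mirrors the three fields of struct iommu_domain_geometry, and iova_in_aperture() and iova_alloc_limit() are hypothetical helpers I made up for the example. The clamp in iova_alloc_limit() is only loosely modelled on how I understand an IOVA allocator honours force_aperture, so please treat it as an assumption rather than the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in mirroring the fields of struct iommu_domain_geometry. */
struct geometry {
	uint64_t aperture_start;	/* first usable IOVA */
	uint64_t aperture_end;		/* last usable IOVA (inclusive) */
	bool force_aperture;		/* true: DMA only allowed inside the aperture */
};

/* Hypothetical helper: is [iova, iova + size - 1] fully inside the aperture? */
static bool iova_in_aperture(const struct geometry *g, uint64_t iova,
			     uint64_t size)
{
	uint64_t last;

	assert(g->aperture_start <= g->aperture_end);
	if (size == 0)
		return false;
	last = iova + size - 1;
	if (last < iova)	/* the range wrapped around */
		return false;
	return iova >= g->aperture_start && last <= g->aperture_end;
}

/*
 * Hypothetical allocator-side limit: the aperture end only caps the
 * allocation limit when force_aperture is set (assumption, see above).
 */
static uint64_t iova_alloc_limit(const struct geometry *g, uint64_t dma_limit)
{
	if (g->force_aperture && g->aperture_end < dma_limit)
		return g->aperture_end;
	return dma_limit;
}
```

With the values from the log (aperture end ffffffffffffff, IOVA ffffffffffffe000, one 4K page), iova_in_aperture() returns false. With force_aperture left false, iova_alloc_limit() does not cap the limit at all, which would match an allocation landing at the very top of the 64-bit space.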
- Ankit
> +	if ((pt_vaddr_t)domain->geometry.aperture_end != range.last_va) {
> +		domain->geometry.aperture_end = ULONG_MAX;
> +		domain->pgsize_bitmap &= ULONG_MAX;
> +	}
> +	return 0;
> +}