The AMD IOMMU documentation seems pretty clear that the V2 table follows the normal CPU expectation of sign extension. This is shown in
Figure 25: AMD64 Long Mode 4-Kbyte Page Address Translation
Where bits Sign-Extend [63:57] == [56]. This is typical for x86, which has three regions in the page table: lower, non-canonical, upper.
The manual describes that the V1 table does not sign extend in section "2.2.4 Sharing AMD64 Processor and IOMMU Page Tables GPA-to-SPA".
Further, Vasant has checked this and indicates the HW has an additional behavior that the manual does not yet describe. The AMDv2 table does not have the sign-extended behavior when attached to PASID 0, which may explain why this has gone unnoticed.
The iommu domain geometry does not directly support sign extended page tables. The driver should report only one of the lower/upper spaces. Solve this by removing the top VA bit from the geometry to use only the lower space.
This will also make the iommu_domain work consistently for both PASID 0 and PASID != 0.
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level: Before 0x1ffffffffffffff, After 0x0ffffffffffffff
4 Level: Before 0xffffffffffff, After 0x7fffffffffff
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address")
Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/amd/iommu.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)
v2:
 - Revise the commit message and comment with the new information from Vasant.
v1: https://patch.msgid.link/r/0-v1-6925ece6b623+296-amdv2_geo_jgg@nvidia.com
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3117d99cf83d0d..1baa9d3583f369 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2526,8 +2526,21 @@ static inline u64 dma_max_address(enum protection_domain_mode pgtable)
 	if (pgtable == PD_MODE_V1)
 		return ~0ULL;
 
-	/* V2 with 4/5 level page table */
-	return ((1ULL << PM_LEVEL_SHIFT(amd_iommu_gpt_level)) - 1);
+	/*
+	 * V2 with 4/5 level page table. Note that "2.2.6.5 AMD64 4-Kbyte Page
+	 * Translation" shows that the V2 table sign extends the top of the
+	 * address space creating a reserved region in the middle of the
+	 * translation, just like the CPU does. Further Vasant says the docs are
+	 * incomplete and this only applies to non-zero PASIDs. If the AMDv2
+	 * page table is assigned to the 0 PASID then there is no sign extension
+	 * check.
+	 *
+	 * Since the IOMMU must have a fixed geometry, and the core code does
+	 * not understand sign extended addressing, we have to chop off the high
+	 * bit to get consistent behavior with attachments of the domain to any
+	 * PASID.
+	 */
+	return ((1ULL << (PM_LEVEL_SHIFT(amd_iommu_gpt_level) - 1)) - 1);
 }
static bool amd_iommu_hd_support(struct amd_iommu *iommu)
base-commit: eb328711b15b17987021dbb674f446b7b008dca5
On 6/10/25 07:58, Jason Gunthorpe wrote:
The AMD IOMMU documentation seems pretty clear that the V2 table follows the normal CPU expectation of sign extension. This is shown in
Figure 25: AMD64 Long Mode 4-Kbyte Page Address Translation
Where bits Sign-Extend [63:57] == [56]. This is typical for x86, which has three regions in the page table: lower, non-canonical, upper.
The manual describes that the V1 table does not sign extend in section "2.2.4 Sharing AMD64 Processor and IOMMU Page Tables GPA-to-SPA".
Further, Vasant has checked this and indicates the HW has an additional behavior that the manual does not yet describe. The AMDv2 table does not have the sign-extended behavior when attached to PASID 0, which may explain why this has gone unnoticed.
The iommu domain geometry does not directly support sign extended page tables. The driver should report only one of the lower/upper spaces. Solve this by removing the top VA bit from the geometry to use only the lower space.
This will also make the iommu_domain work consistently for both PASID 0 and PASID != 0.
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level: Before 0x1ffffffffffffff, After 0x0ffffffffffffff
4 Level: Before 0xffffffffffff, After 0x7fffffffffff
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address")
Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
On 6/10/2025 5:28 AM, Jason Gunthorpe wrote:
The AMD IOMMU documentation seems pretty clear that the V2 table follows the normal CPU expectation of sign extension. This is shown in
Figure 25: AMD64 Long Mode 4-Kbyte Page Address Translation
Where bits Sign-Extend [63:57] == [56]. This is typical for x86, which has three regions in the page table: lower, non-canonical, upper.
The manual describes that the V1 table does not sign extend in section "2.2.4 Sharing AMD64 Processor and IOMMU Page Tables GPA-to-SPA".
Further, Vasant has checked this and indicates the HW has an additional behavior that the manual does not yet describe. The AMDv2 table does not have the sign-extended behavior when attached to PASID 0, which may explain why this has gone unnoticed.
The iommu domain geometry does not directly support sign extended page tables. The driver should report only one of the lower/upper spaces. Solve this by removing the top VA bit from the geometry to use only the lower space.
This will also make the iommu_domain work consistently on all PASID 0 and PASID != 1.
You meant PASID != 0 ?
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level: Before 0x1ffffffffffffff, After 0x0ffffffffffffff
4 Level: Before 0xffffffffffff, After 0x7fffffffffff
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address")
Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
-Vasant
 drivers/iommu/amd/iommu.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)
v2:
- Revise the commit message and comment with the new information from Vasant.
v1: https://patch.msgid.link/r/0-v1-6925ece6b623+296-amdv2_geo_jgg@nvidia.com
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3117d99cf83d0d..1baa9d3583f369 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2526,8 +2526,21 @@ static inline u64 dma_max_address(enum protection_domain_mode pgtable)
 	if (pgtable == PD_MODE_V1)
 		return ~0ULL;
 
-	/* V2 with 4/5 level page table */
-	return ((1ULL << PM_LEVEL_SHIFT(amd_iommu_gpt_level)) - 1);
+	/*
+	 * V2 with 4/5 level page table. Note that "2.2.6.5 AMD64 4-Kbyte Page
+	 * Translation" shows that the V2 table sign extends the top of the
+	 * address space creating a reserved region in the middle of the
+	 * translation, just like the CPU does. Further Vasant says the docs are
+	 * incomplete and this only applies to non-zero PASIDs. If the AMDv2
+	 * page table is assigned to the 0 PASID then there is no sign extension
+	 * check.
+	 *
+	 * Since the IOMMU must have a fixed geometry, and the core code does
+	 * not understand sign extended addressing, we have to chop off the high
+	 * bit to get consistent behavior with attachments of the domain to any
+	 * PASID.
+	 */
+	return ((1ULL << (PM_LEVEL_SHIFT(amd_iommu_gpt_level) - 1)) - 1);
 }
 
 static bool amd_iommu_hd_support(struct amd_iommu *iommu)
base-commit: eb328711b15b17987021dbb674f446b7b008dca5
On Thu, Jun 12, 2025 at 10:54:00AM +0530, Vasant Hegde wrote:
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level: Before 0x1ffffffffffffff, After 0x0ffffffffffffff
4 Level: Before 0xffffffffffff, After 0x7fffffffffff
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address")
Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Will, can you pick this up please? It seems to have been overlooked
Jason
Hi Jason, Vasant,
On Wed, Jul 16, 2025 at 09:49:29AM -0300, Jason Gunthorpe wrote:
On Thu, Jun 12, 2025 at 10:54:00AM +0530, Vasant Hegde wrote:
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level: Before 0x1ffffffffffffff, After 0x0ffffffffffffff
4 Level: Before 0xffffffffffff, After 0x7fffffffffff
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address")
Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Will, can you pick this up please? It seems to have been overlooked
Thanks, I had missed this one (I only trawled the list for the two weeks prior to Joerg going on holiday).
I'll pick it up, but please note that the preceding:
if (pgtable == PD_MODE_V1)
part now returns:
PM_LEVEL_SIZE(amd_iommu_hpt_level);
instead of:
~0ULL;
thanks to 025d1371cc8c ("iommu/amd: Add efr[HATS] max v1 page table level"). I'm assuming that's fine because this change is about v2, but I just wanted to highlight it in case there's a potential issue.
Will
On Thu, Jul 17, 2025 at 11:03:56AM +0100, Will Deacon wrote:
thanks to 025d1371cc8c ("iommu/amd: Add efr[HATS] max v1 page table level"). I'm assuming that's fine because this change is about v2, but I just wanted to highlight it in case there's a potential issue.
Yes, this is only about V2, V1 is OK with the HATS changes
Thanks Jason
Hi Will,
On 7/17/2025 3:33 PM, Will Deacon wrote:
Hi Jason, Vasant,
On Wed, Jul 16, 2025 at 09:49:29AM -0300, Jason Gunthorpe wrote:
On Thu, Jun 12, 2025 at 10:54:00AM +0530, Vasant Hegde wrote:
Adjust dma_max_address() to remove the top VA bit. It now returns:
5 Level: Before 0x1ffffffffffffff, After 0x0ffffffffffffff
4 Level: Before 0xffffffffffff, After 0x7fffffffffff
Fixes: 11c439a19466 ("iommu/amd/pgtbl_v2: Fix domain max address")
Link: https://lore.kernel.org/all/8858d4d6-d360-4ef0-935c-bfd13ea54f42@amd.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Will, can you pick this up please? It seems to have been overlooked
Thanks, I had missed this one (I only trawled the list for the two weeks prior to Joerg going on holiday).
I'll pick it up, but please note that the preceding:
if (pgtable == PD_MODE_V1)
part now returns:
PM_LEVEL_SIZE(amd_iommu_hpt_level);
instead of:
~0ULL;
Right. Now it checks the supported level instead of assuming 64-bit. This is correct.
thanks to 025d1371cc8c ("iommu/amd: Add efr[HATS] max v1 page table level"). I'm assuming that's fine because this change is about v2, but I just wanted to highlight it in case there's a potential issue.
Thanks. I did verify the tree and it looks good.
-Vasant
On Mon, 09 Jun 2025 20:58:05 -0300, Jason Gunthorpe wrote:
The AMD IOMMU documentation seems pretty clear that the V2 table follows the normal CPU expectation of sign extension. This is shown in
Figure 25: AMD64 Long Mode 4-Kbyte Page Address Translation
Where bits Sign-Extend [63:57] == [56]. This is typical for x86 which would have three regions in the page table: lower, non-canonical, upper.
[...]
Applied to iommu (amd/amd-vi), thanks!
[1/1] iommu/amd: Fix geometry.aperture_end for V2 tables https://git.kernel.org/iommu/c/8637afa79cfa
Cheers,