Hi Jerry,
On 2022/8/9 00:21, Jerry Snitselaar wrote:
On Mon, Aug 08, 2022 at 11:46:12AM +0800, Lu Baolu wrote:
The translation table copying code for kdump kernels is currently based on the extended root/context entry formats of ECS mode defined in the older VT-d v2.5 spec, and doesn't handle the scalable mode formats. As a result, the kexec capture kernel fails to boot with DMAR faults if the previous kernel enabled the IOMMU in scalable mode.
ECS mode has been deprecated by the VT-d spec since v3.0, and the Intel IOMMU driver doesn't support it as there's no real hardware implementation. Hence this converts the ECS checks in the table copying code into scalable mode checks.
The existing copying code consumes a bit in the context entry to mark a copied entry. This marker needs to work for the old format as well as for extended context entries. It's hard to find such a bit for both legacy and scalable mode context entries. This replaces it with a per-IOMMU bitmap.
Fixes: 7373a8cc38197 ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: stable@vger.kernel.org
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
I did a quick test last night, and it was able to harvest the vmcore, and boot back up. Before you mentioned part of the issue being that it couldn't get to the PGTT field in the pasid table entry. Was that not the case,
It is the case from the IOMMU hardware's point of view.
or is it looking at the old kernel pasid dir entries and table entries through the pasid dir pointer in the copied context entry?
Yes. It reuses the old kernel's PASID table and only replaces it once the new device driver takes over and starts its first DMA operation.
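For illustration, the per-IOMMU bitmap mentioned in the commit message could be sketched roughly as below. This is a user-space sketch only, not the driver code: the struct, the helper names and the use of calloc() are assumptions made for the example. It keeps one bit per (bus, devfn) source ID instead of borrowing a bit from the context entry, so it works the same way for legacy and scalable mode entries.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One bit per (bus, devfn) source ID: 256 buses x 256 devfn values. */
#define NUM_SOURCE_IDS (1u << 16)
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS   (NUM_SOURCE_IDS / BITS_PER_WORD)

/* Hypothetical stand-in for the per-IOMMU state; not the driver's struct. */
struct iommu_sketch {
	unsigned long *copied_tables;	/* bit set => context entry was copied */
};

/* Combine bus and devfn into a 16-bit index into the bitmap. */
static unsigned int source_id(uint8_t bus, uint8_t devfn)
{
	return ((unsigned int)bus << 8) | devfn;
}

/* Allocate a zeroed bitmap; returns 0 on success, -1 on allocation failure. */
static int iommu_sketch_init(struct iommu_sketch *iommu)
{
	assert(iommu != NULL);
	iommu->copied_tables = calloc(BITMAP_WORDS, sizeof(unsigned long));
	return iommu->copied_tables ? 0 : -1;
}

static void iommu_sketch_destroy(struct iommu_sketch *iommu)
{
	free(iommu->copied_tables);
	iommu->copied_tables = NULL;
}

/* Mark the context entry for (bus, devfn) as copied from the old kernel. */
static void set_context_copied(struct iommu_sketch *iommu,
			       uint8_t bus, uint8_t devfn)
{
	unsigned int id = source_id(bus, devfn);

	iommu->copied_tables[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
}

/* Clear the mark, e.g. once the new kernel installs its own entry. */
static void clear_context_copied(struct iommu_sketch *iommu,
				 uint8_t bus, uint8_t devfn)
{
	unsigned int id = source_id(bus, devfn);

	iommu->copied_tables[id / BITS_PER_WORD] &= ~(1UL << (id % BITS_PER_WORD));
}

/* Test whether the context entry for (bus, devfn) is still a copied one. */
static bool context_copied(const struct iommu_sketch *iommu,
			   uint8_t bus, uint8_t devfn)
{
	unsigned int id = source_id(bus, devfn);

	return (iommu->copied_tables[id / BITS_PER_WORD] >>
		(id % BITS_PER_WORD)) & 1UL;
}
```

The bitmap costs 8 KiB per IOMMU in this sketch, and the marker no longer depends on the layout of the context entry.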
Best regards,
baolu