Quoting Joerg Roedel (2020-08-21 11:22:04)
On Fri, Aug 21, 2020 at 10:54:22AM +0100, Chris Wilson wrote:
Ok. I thought it had to be after assigning the *ptep. If we apply the sync first, do not have to worry about PGTBL_PTE_MODIFIED from the *ptep?
Hmm, if I understand the code correctly, you are re-implementing some generic ioremap/vmalloc mapping logic in the i915 driver. I don't know the reason, but if it is valid you need to manually call arch_sync_kernel_mappings() from your driver like this to be correct:
if (ARCH_PAGE_TABLE_SYNC_MASK & PGTBL_PTE_MODIFIED) arch_sync_kernel_mappings();
In practice this is a no-op, because nobody sets PGTBL_PTE_MODIFIED in ARCH_PAGE_TABLE_SYNC_MASK, so the above code would be optimized away.
But what you really care about is the tracking in apply_to_page_range(), as that allocates the !pte levels of your page-table, which needs synchronization on x86-32.
Btw, what are the reasons you can't use generic vmalloc/ioremap interfaces to map the range?
ioremap_prot and ioremap_page_range assume a contiguous IO address. So we needed to allocate the vmalloc area [and would then need to iterate over the discontiguous iomem chunks with ioremap_page_range], and since alloc_vm_area returned the ptep, it looked clearer to then assign those according to whether we wanted ioremapping or a plain page. So we ended up with one call to the core to return us a vm_struct and a pte array that worked for either backing store. -Chris