If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
Cc: stable@vger.kernel.org Fixes: 76612d6ce4cc ("mm/hmm: reorganize how !pte_present is handled in hmm_vma_handle_pte()") Signed-off-by: Ralph Campbell rcampbell@nvidia.com Reported-by: Felix Kuehling felix.kuehling@amd.com --- mm/hmm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/mm/hmm.c b/mm/hmm.c index 3fd3242c5e50..7db2b29bdc85 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -273,6 +273,9 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, if (!non_swap_entry(entry)) goto fault;
+ if (is_device_private_entry(entry)) + goto fault; + if (is_device_exclusive_entry(entry)) goto fault;
On Fri, Jul 22, 2022 at 03:56:32PM -0700, Ralph Campbell wrote:
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
Cc: stable@vger.kernel.org Fixes: 76612d6ce4cc ("mm/hmm: reorganize how !pte_present is handled in hmm_vma_handle_pte()") Signed-off-by: Ralph Campbell rcampbell@nvidia.com Reported-by: Felix Kuehling felix.kuehling@amd.com
mm/hmm.c | 3 +++ 1 file changed, 3 insertions(+)
Should we have a test for this ?
Reviewed-by: Jason Gunthorpe jgg@nvidia.com
Jason
On 7/23/22 06:32, Jason Gunthorpe wrote:
On Fri, Jul 22, 2022 at 03:56:32PM -0700, Ralph Campbell wrote:
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
Cc: stable@vger.kernel.org Fixes: 76612d6ce4cc ("mm/hmm: reorganize how !pte_present is handled in hmm_vma_handle_pte()") Signed-off-by: Ralph Campbell rcampbell@nvidia.com Reported-by: Felix Kuehling felix.kuehling@amd.com
mm/hmm.c | 3 +++ 1 file changed, 3 insertions(+)
Should we have a test for this ?
Reviewed-by: Jason Gunthorpe jgg@nvidia.com
Jason
Sure, I'll add something to tools/testing/selftests/vm/hmm-tests.c
Ralph Campbell rcampbell@nvidia.com writes:
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
/* * Never fault in device private pages, but just report * the PFN even if not present. */
This comment needs updating because it will be possible to fault in device private pages now.
It also looks a bit strange to be checking for device private entries twice - I think it would be clearer if hmm_is_device_private_entry() is removed and the ownership check done directly in hmm_vma_handle_pte().
- Alistair
Cc: stable@vger.kernel.org Fixes: 76612d6ce4cc ("mm/hmm: reorganize how !pte_present is handled in hmm_vma_handle_pte()") Signed-off-by: Ralph Campbell rcampbell@nvidia.com Reported-by: Felix Kuehling felix.kuehling@amd.com
mm/hmm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/mm/hmm.c b/mm/hmm.c index 3fd3242c5e50..7db2b29bdc85 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -273,6 +273,9 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, if (!non_swap_entry(entry)) goto fault;
if (is_device_private_entry(entry))
goto fault;
- if (is_device_exclusive_entry(entry)) goto fault;
On 7/25/22 02:32, Alistair Popple wrote:
Ralph Campbell rcampbell@nvidia.com writes:
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
/* * Never fault in device private pages, but just report * the PFN even if not present. */
This comment needs updating because it will be possible to fault in device private pages now.
It also looks a bit strange to be checking for device private entries twice - I think it would be clearer if hmm_is_device_private_entry() is removed and the ownership check done directly in hmm_vma_handle_pte().
- Alistair
I'll fix this in v2. Thanks for the review.
Am 2022-07-22 um 18:56 schrieb Ralph Campbell:
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
Cc: stable@vger.kernel.org Fixes: 76612d6ce4cc ("mm/hmm: reorganize how !pte_present is handled in hmm_vma_handle_pte()") Signed-off-by: Ralph Campbell rcampbell@nvidia.com Reported-by: Felix Kuehling felix.kuehling@amd.com
Acked-by: Felix Kuehling Felix.Kuehling@amd.com
Thank you!
mm/hmm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/mm/hmm.c b/mm/hmm.c index 3fd3242c5e50..7db2b29bdc85 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -273,6 +273,9 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr, if (!non_swap_entry(entry)) goto fault;
if (is_device_private_entry(entry))
goto fault;
- if (is_device_exclusive_entry(entry)) goto fault;
On Fri, 22 Jul 2022 15:56:32 -0700 Ralph Campbell rcampbell@nvidia.com wrote:
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
Could we please include here a description of the end-user visible effects of the bug?
Cc: stable@vger.kernel.org
Especially when proposing a -stable backport.
On 7/25/22 11:49, Andrew Morton wrote:
On Fri, 22 Jul 2022 15:56:32 -0700 Ralph Campbell rcampbell@nvidia.com wrote:
If hmm_range_fault() is called with the HMM_PFN_REQ_FAULT flag and a device private PTE is found, the hmm_range::dev_private_owner page is used to determine if the device private page should not be faulted in. However, if the device private page is not owned by the caller, hmm_range_fault() returns an error instead of calling migrate_to_ram() to fault in the page.
Could we please include here a description of the end-user visible effects of the bug?
Cc: stable@vger.kernel.org
Especially when proposing a -stable backport.
If I add the following, would that be sufficient? Should I post a v3?
For example, if a page is migrated to GPU private memory and a RDMA fault capable NIC tries to read the migrated page, without this patch it will get an error. With this patch, the page will be migrated back to system memory and the NIC will be able to read the data.
On Mon, 25 Jul 2022 12:07:27 -0700 Ralph Campbell rcampbell@nvidia.com wrote:
to fault in the page.
Could we please include here a description of the end-user visible effects of the bug?
Cc: stable@vger.kernel.org
Especially when proposing a -stable backport.
If I add the following, would that be sufficient? Should I post a v3?
For example, if a page is migrated to GPU private memory and a RDMA fault capable NIC tries to read the migrated page, without this patch it will get an error. With this patch, the page will be migrated back to system memory and the NIC will be able to read the data.
I added it, thanks.
linux-stable-mirror@lists.linaro.org