August 2024 - Linux-stable-mirror

FAILED: patch "[PATCH] drm/amdkfd: don't allow mapping the MMIO HDP page with large" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x 24e82654e98e96cece5d8b919c522054456eeec6 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081201-approach-power-0318@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: 24e82654e98e ("drm/amdkfd: don't allow mapping the MMIO HDP page with large pages") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 24e82654e98e96cece5d8b919c522054456eeec6 Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher(a)amd.com> Date: Sun, 14 Apr 2024 13:06:39 -0400 Subject: [PATCH] drm/amdkfd: don't allow mapping the MMIO HDP page with large pages We don't get the right offset in that case. The GPU has an unused 4K area of the register BAR space into which you can remap registers. We remap the HDP flush registers into this space to allow userspace (CPU or GPU) to flush the HDP when it updates VRAM. However, on systems with >4K pages, we end up exposing PAGE_SIZE of MMIO space. Fixes: d8e408a82704 ("drm/amdkfd: Expose HDP registers to user space") Reviewed-by: Felix Kuehling <felix.kuehling(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 6b713fb0b818..fdf171ad4a3c 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1144,7 +1144,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep, goto err_unlock; } offset = dev->adev->rmmio_remap.bus_addr; - if (!offset) { + if (!offset || (PAGE_SIZE > 4096)) { err = -ENOMEM; goto err_unlock; } @@ -2312,7 +2312,7 @@ static int criu_restore_memory_of_gpu(struct kfd_process_device *pdd, return -EINVAL; } offset = pdd->dev->adev->rmmio_remap.bus_addr; - if (!offset) { + if (!offset || (PAGE_SIZE > 4096)) { pr_err("amdgpu_amdkfd_get_mmio_remap_phys_addr failed\n"); return -ENOMEM; } @@ -3354,6 +3354,9 @@ static int kfd_mmio_mmap(struct kfd_node *dev, struct kfd_process *process, if (vma->vm_end - vma->vm_start != PAGE_SIZE) return -EINVAL; + if (PAGE_SIZE > 4096) + return -EINVAL; + address = dev->adev->rmmio_remap.bus_addr; vm_flags_set(vma, VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE |

10 months, 3 weeks

1
0
0 0

FAILED: patch "[PATCH] drm/amdkfd: don't allow mapping the MMIO HDP page with large" failed to apply to 6.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y git checkout FETCH_HEAD git cherry-pick -x 24e82654e98e96cece5d8b919c522054456eeec6 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081200-overuse-imaginary-f30a@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^.. Possible dependencies: 24e82654e98e ("drm/amdkfd: don't allow mapping the MMIO HDP page with large pages") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 24e82654e98e96cece5d8b919c522054456eeec6 Mon Sep 17 00:00:00 2001 From: Alex Deucher <alexander.deucher(a)amd.com> Date: Sun, 14 Apr 2024 13:06:39 -0400 Subject: [PATCH] drm/amdkfd: don't allow mapping the MMIO HDP page with large pages We don't get the right offset in that case. The GPU has an unused 4K area of the register BAR space into which you can remap registers. We remap the HDP flush registers into this space to allow userspace (CPU or GPU) to flush the HDP when it updates VRAM. However, on systems with >4K pages, we end up exposing PAGE_SIZE of MMIO space. Fixes: d8e408a82704 ("drm/amdkfd: Expose HDP registers to user space") Reviewed-by: Felix Kuehling <felix.kuehling(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 6b713fb0b818..fdf171ad4a3c 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1144,7 +1144,7 @@ static int kfd_ioctl_alloc_memory_of_gpu(struct file *filep, goto err_unlock; } offset = dev->adev->rmmio_remap.bus_addr; - if (!offset) { + if (!offset || (PAGE_SIZE > 4096)) { err = -ENOMEM; goto err_unlock; } @@ -2312,7 +2312,7 @@ static int criu_restore_memory_of_gpu(struct kfd_process_device *pdd, return -EINVAL; } offset = pdd->dev->adev->rmmio_remap.bus_addr; - if (!offset) { + if (!offset || (PAGE_SIZE > 4096)) { pr_err("amdgpu_amdkfd_get_mmio_remap_phys_addr failed\n"); return -ENOMEM; } @@ -3354,6 +3354,9 @@ static int kfd_mmio_mmap(struct kfd_node *dev, struct kfd_process *process, if (vma->vm_end - vma->vm_start != PAGE_SIZE) return -EINVAL; + if (PAGE_SIZE > 4096) + return -EINVAL; + address = dev->adev->rmmio_remap.bus_addr; vm_flags_set(vma, VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE |

10 months, 3 weeks

1
0
0 0

FAILED: patch "[PATCH] drm/amd: Fix shutdown (again) on some SMU v13.0.4/11" failed to apply to 6.1-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y git checkout FETCH_HEAD git cherry-pick -x ff4e49f446ed24772182c724e0ef1a5be23c622a # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081249-derail-recovery-86ca@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^.. Possible dependencies: ff4e49f446ed ("drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms") b911505e6ba4 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") fedb6ae49758 ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From ff4e49f446ed24772182c724e0ef1a5be23c622a Mon Sep 17 00:00:00 2001 From: Mario Limonciello <mario.limonciello(a)amd.com> Date: Sun, 26 May 2024 07:59:08 -0500 Subject: [PATCH] drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms commit cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") attempted to fix shutdown issues that were reported since commit 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") but caused issues for some people. Adjust the workaround flow to properly only apply in the S4 case: -> For shutdown go through SMU_MSG_PrepareMp1ForUnload -> For S4 go through SMU_MSG_GfxDeviceDriverReset and SMU_MSG_PrepareMp1ForUnload Reported-and-tested-by: lectrode <electrodexsnet(a)gmail.com> Closes: https://github.com/void-linux/void-packages/issues/50417 Cc: stable(a)vger.kernel.org Fixes: cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") Reviewed-by: Tim Huang <Tim.Huang(a)amd.com> Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c index dfc76f6b468f..b081ae3e8f43 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c @@ -226,15 +226,17 @@ static int smu_v13_0_4_system_features_control(struct smu_context *smu, bool en) struct amdgpu_device *adev = smu->adev; int ret = 0; - if (!en && adev->in_s4) { - /* Adds a GFX reset as workaround just before sending the - * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering - * an invalid state. - */ - ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, - SMU_RESET_MODE_2, NULL); - if (ret) - return ret; + if (!en && !adev->in_s0ix) { + if (adev->in_s4) { + /* Adds a GFX reset as workaround just before sending the + * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering + * an invalid state. + */ + ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, + SMU_RESET_MODE_2, NULL); + if (ret) + return ret; + } ret = smu_cmn_send_smc_msg(smu, SMU_MSG_PrepareMp1ForUnload, NULL); }

10 months, 3 weeks

1
0
0 0

FAILED: patch "[PATCH] drm/amd: Fix shutdown (again) on some SMU v13.0.4/11" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x ff4e49f446ed24772182c724e0ef1a5be23c622a # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081248-evident-fever-c73b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: ff4e49f446ed ("drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms") b911505e6ba4 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") fedb6ae49758 ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From ff4e49f446ed24772182c724e0ef1a5be23c622a Mon Sep 17 00:00:00 2001 From: Mario Limonciello <mario.limonciello(a)amd.com> Date: Sun, 26 May 2024 07:59:08 -0500 Subject: [PATCH] drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms commit cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") attempted to fix shutdown issues that were reported since commit 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") but caused issues for some people. Adjust the workaround flow to properly only apply in the S4 case: -> For shutdown go through SMU_MSG_PrepareMp1ForUnload -> For S4 go through SMU_MSG_GfxDeviceDriverReset and SMU_MSG_PrepareMp1ForUnload Reported-and-tested-by: lectrode <electrodexsnet(a)gmail.com> Closes: https://github.com/void-linux/void-packages/issues/50417 Cc: stable(a)vger.kernel.org Fixes: cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") Reviewed-by: Tim Huang <Tim.Huang(a)amd.com> Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c index dfc76f6b468f..b081ae3e8f43 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c @@ -226,15 +226,17 @@ static int smu_v13_0_4_system_features_control(struct smu_context *smu, bool en) struct amdgpu_device *adev = smu->adev; int ret = 0; - if (!en && adev->in_s4) { - /* Adds a GFX reset as workaround just before sending the - * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering - * an invalid state. - */ - ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, - SMU_RESET_MODE_2, NULL); - if (ret) - return ret; + if (!en && !adev->in_s0ix) { + if (adev->in_s4) { + /* Adds a GFX reset as workaround just before sending the + * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering + * an invalid state. + */ + ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, + SMU_RESET_MODE_2, NULL); + if (ret) + return ret; + } ret = smu_cmn_send_smc_msg(smu, SMU_MSG_PrepareMp1ForUnload, NULL); }

10 months, 3 weeks

1
0
0 0

FAILED: patch "[PATCH] drm/amd: Fix shutdown (again) on some SMU v13.0.4/11" failed to apply to 6.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y git checkout FETCH_HEAD git cherry-pick -x ff4e49f446ed24772182c724e0ef1a5be23c622a # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081247-smolder-deputize-5350@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^.. Possible dependencies: ff4e49f446ed ("drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms") b911505e6ba4 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From ff4e49f446ed24772182c724e0ef1a5be23c622a Mon Sep 17 00:00:00 2001 From: Mario Limonciello <mario.limonciello(a)amd.com> Date: Sun, 26 May 2024 07:59:08 -0500 Subject: [PATCH] drm/amd: Fix shutdown (again) on some SMU v13.0.4/11 platforms commit cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") attempted to fix shutdown issues that were reported since commit 31729e8c21ec ("drm/amd/pm: fixes a random hang in S4 for SMU v13.0.4/11") but caused issues for some people. Adjust the workaround flow to properly only apply in the S4 case: -> For shutdown go through SMU_MSG_PrepareMp1ForUnload -> For S4 go through SMU_MSG_GfxDeviceDriverReset and SMU_MSG_PrepareMp1ForUnload Reported-and-tested-by: lectrode <electrodexsnet(a)gmail.com> Closes: https://github.com/void-linux/void-packages/issues/50417 Cc: stable(a)vger.kernel.org Fixes: cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users") Reviewed-by: Tim Huang <Tim.Huang(a)amd.com> Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c index dfc76f6b468f..b081ae3e8f43 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c @@ -226,15 +226,17 @@ static int smu_v13_0_4_system_features_control(struct smu_context *smu, bool en) struct amdgpu_device *adev = smu->adev; int ret = 0; - if (!en && adev->in_s4) { - /* Adds a GFX reset as workaround just before sending the - * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering - * an invalid state. - */ - ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, - SMU_RESET_MODE_2, NULL); - if (ret) - return ret; + if (!en && !adev->in_s0ix) { + if (adev->in_s4) { + /* Adds a GFX reset as workaround just before sending the + * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering + * an invalid state. + */ + ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset, + SMU_RESET_MODE_2, NULL); + if (ret) + return ret; + } ret = smu_cmn_send_smc_msg(smu, SMU_MSG_PrepareMp1ForUnload, NULL); }

10 months, 3 weeks

1
0
0 0

FAILED: patch "[PATCH] drm/amdgpu: fix locking scope when flushing tlb" failed to apply to 6.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y git checkout FETCH_HEAD git cherry-pick -x 9c33e5fd4fb63b793d9a92bf35d190630d9bada4 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024080738-tarmac-unproven-1f45@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^.. Possible dependencies: 9c33e5fd4fb6 ("drm/amdgpu: fix locking scope when flushing tlb") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 9c33e5fd4fb63b793d9a92bf35d190630d9bada4 Mon Sep 17 00:00:00 2001 From: Yunxiang Li <Yunxiang.Li(a)amd.com> Date: Thu, 23 May 2024 07:48:19 -0400 Subject: [PATCH] drm/amdgpu: fix locking scope when flushing tlb MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Which method is used to flush tlb does not depend on whether a reset is in progress or not. We should skip flush altogether if the GPU will get reset. So put both path under reset_domain read lock. Signed-off-by: Yunxiang Li <Yunxiang.Li(a)amd.com> Reviewed-by: Christian König <christian.koenig(a)amd.com> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> CC: stable(a)vger.kernel.org diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 660599823050..322b8ff67cde 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -682,12 +682,17 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid, struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring; struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst]; unsigned int ndw; - signed long r; + int r; uint32_t seq; - if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready || - !down_read_trylock(&adev->reset_domain->sem)) { + /* + * A GPU reset should flush all TLBs anyway, so no need to do + * this while one is ongoing. + */ + if (!down_read_trylock(&adev->reset_domain->sem)) + return 0; + if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready) { if (adev->gmc.flush_tlb_needs_extra_type_2) adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid, 2, all_hub, @@ -701,44 +706,41 @@ int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid, adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid, flush_type, all_hub, inst); - return 0; - } + r = 0; + } else { + /* 2 dwords flush + 8 dwords fence */ + ndw = kiq->pmf->invalidate_tlbs_size + 8; - /* 2 dwords flush + 8 dwords fence */ - ndw = kiq->pmf->invalidate_tlbs_size + 8; + if (adev->gmc.flush_tlb_needs_extra_type_2) + ndw += kiq->pmf->invalidate_tlbs_size; - if (adev->gmc.flush_tlb_needs_extra_type_2) - ndw += kiq->pmf->invalidate_tlbs_size; + if (adev->gmc.flush_tlb_needs_extra_type_0) + ndw += kiq->pmf->invalidate_tlbs_size; - if (adev->gmc.flush_tlb_needs_extra_type_0) - ndw += kiq->pmf->invalidate_tlbs_size; + spin_lock(&adev->gfx.kiq[inst].ring_lock); + amdgpu_ring_alloc(ring, ndw); + if (adev->gmc.flush_tlb_needs_extra_type_2) + kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 2, all_hub); - spin_lock(&adev->gfx.kiq[inst].ring_lock); - amdgpu_ring_alloc(ring, ndw); - if (adev->gmc.flush_tlb_needs_extra_type_2) - kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 2, all_hub); + if (flush_type == 2 && adev->gmc.flush_tlb_needs_extra_type_0) + kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 0, all_hub); - if (flush_type == 2 && adev->gmc.flush_tlb_needs_extra_type_0) - kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 0, all_hub); + kiq->pmf->kiq_invalidate_tlbs(ring, pasid, flush_type, all_hub); + r = amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT); + if (r) { + amdgpu_ring_undo(ring); + spin_unlock(&adev->gfx.kiq[inst].ring_lock); + goto error_unlock_reset; + } - kiq->pmf->kiq_invalidate_tlbs(ring, pasid, flush_type, all_hub); - r = amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT); - if (r) { - amdgpu_ring_undo(ring); + amdgpu_ring_commit(ring); spin_unlock(&adev->gfx.kiq[inst].ring_lock); - goto error_unlock_reset; + if (amdgpu_fence_wait_polling(ring, seq, usec_timeout) < 1) { + dev_err(adev->dev, "timeout waiting for kiq fence\n"); + r = -ETIME; + } } - amdgpu_ring_commit(ring); - spin_unlock(&adev->gfx.kiq[inst].ring_lock); - r = amdgpu_fence_wait_polling(ring, seq, usec_timeout); - if (r < 1) { - dev_err(adev->dev, "wait for kiq fence error: %ld.\n", r); - r = -ETIME; - goto error_unlock_reset; - } - r = 0; - error_unlock_reset: up_read(&adev->reset_domain->sem); return r;

10 months, 3 weeks

2
4
0 0

[PATCH 0/1] xfs: fix log recovery buffer allocation for the

by Kevin Berry

Hi folks, in order to fix CVE-2024-39472 in kernels 5.15, 6.1, and 6.6, I have adapted the mainline patch 45cf976008dd (xfs: fix log recovery buffer allocation for the legacy h_size fixup) to resolve conflicts with those kernels. Specifically, the mainline patch uses kvfree, but the amended patch uses kmem_free since kmem_free was used in xfs until patch 49292576136f (xfs: convert kmem_free() for kvmalloc users to kvfree()). I tested the patch by applying it to the above kernels and recompiling them. I also ran xfstests on the 6.6.43 kernel with the patch applied. In my initial xfstests run, all xfs and generic tests passed except for generic/082, generic/269, generic/627, and xfs/155, but those tests all passed on a second run. I'm assuming those initial failures were unrelated to this patch, unless someone more familiar with those tests thinks otherwise. I'd be more than happy to do any more verification or tests if they're required. Thanks! Christoph Hellwig (1): xfs: fix log recovery buffer allocation for the legacy h_size fixup fs/xfs/xfs_log_recover.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) base-commit: 58b0425ff5df680d0b67f64ae1f3f1ebdf1c4de9 -- 2.46.0.76.ge559c4bf1a-goog

10 months, 3 weeks

2
2
0 0

[PATCH 6.1 0/2]

by Bart Van Assche

Hi Greg, The mq-deadline 'async_depth' sysfs attribute doesn't behave as intended. This patch series fixes the implementation of that attribute. The original patch series is available here: https://lore.kernel.org/linux-block/20240509170149.7639-1-bvanassche@acm.or… Thanks, Bart. Bart Van Assche (2): block: Call .limit_depth() after .hctx has been set block/mq-deadline: Fix the tag reservation code block/blk-mq.c | 6 +++++- block/mq-deadline.c | 20 +++++++++++++++++--- 2 files changed, 22 insertions(+), 4 deletions(-)

10 months, 3 weeks

2
3
0 0

FAILED: patch "[PATCH] nouveau: set placement to original placement on uvmm" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x 9c685f61722d30a22d55bb8a48f7a48bb2e19bcc # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024080722-overlying-unmasking-3eb7@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: 9c685f61722d ("nouveau: set placement to original placement on uvmm validate.") 014f831abcb8 ("drm/nouveau: use GPUVM common infrastructure") 94bc2249f08e ("drm/gpuvm: add an abstraction for a VM / BO combination") 8af72338dd81 ("drm/gpuvm: reference count drm_gpuvm structures") 266f7618e761 ("drm/nouveau: separately allocate struct nouveau_uvmm") 809ef191ee60 ("drm/gpuvm: add drm_gpuvm_flags to drm_gpuvm") 6118411428a3 ("drm/nouveau: make use of the GPUVM's shared dma-resv") bbe8458037e7 ("drm/gpuvm: add common dma-resv per struct drm_gpuvm") b41e297abd23 ("drm/nouveau: make use of drm_gpuvm_range_valid()") 546ca4d35dcc ("drm/gpuvm: convert WARN() to drm_WARN() variants") 78f54469b871 ("drm/nouveau: uvmm: rename 'umgr' to 'base'") f72c2db47080 ("drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 9c685f61722d30a22d55bb8a48f7a48bb2e19bcc Mon Sep 17 00:00:00 2001 From: Dave Airlie <airlied(a)redhat.com> Date: Wed, 15 May 2024 12:55:41 +1000 Subject: [PATCH] nouveau: set placement to original placement on uvmm validate. When a buffer is evicted for memory pressure or TTM evict all, the placement is set to the eviction domain, this means the buffer never gets revalidated on the next exec to the correct domain. I think this should be fine to use the initial domain from the object creation, as least with VM_BIND this won't change after init so this should be the correct answer. Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI") Cc: Danilo Krummrich <dakr(a)redhat.com> Cc: <stable(a)vger.kernel.org> # v6.6 Signed-off-by: Dave Airlie <airlied(a)redhat.com> Signed-off-by: Danilo Krummrich <dakr(a)kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240515025542.2156774-1-airl… diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c index 9402fa320a7e..48f105239f42 100644 --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c @@ -1803,6 +1803,7 @@ nouveau_uvmm_bo_validate(struct drm_gpuvm_bo *vm_bo, struct drm_exec *exec) { struct nouveau_bo *nvbo = nouveau_gem_object(vm_bo->obj); + nouveau_bo_placement_set(nvbo, nvbo->valid_domains, 0); return nouveau_bo_validate(nvbo, true, false); }

10 months, 3 weeks

3
2
0 0

FAILED: patch "[PATCH] mm/hugetlb: fix potential race in" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x 5596d9e8b553dacb0ac34bcf873cbbfb16c3ba3e # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024071559-unroasted-trapper-8b66@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: 5596d9e8b553 ("mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio()") bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers") 51718e25c53f ("mm: convert arch_clear_hugepage_flags to take a folio") 831bc31a5e82 ("mm: hugetlb: improve the handling of hugetlb allocation failure for freed or in-use hugetlb") ebc20dcac4ce ("mm: hugetlb_vmemmap: convert page to folio") c5ad3233ead5 ("hugetlb_vmemmap: use folio argument for hugetlb_vmemmap_* functions") c24f188b2289 ("hugetlb: batch TLB flushes when restoring vmemmap") f13b83fdd996 ("hugetlb: batch TLB flushes when freeing vmemmap") f4b7e3efaddb ("hugetlb: batch PMD split for bulk vmemmap dedup") 91f386bf0772 ("hugetlb: batch freeing of vmemmap pages") cfb8c75099db ("hugetlb: perform vmemmap restoration on a list of pages") 79359d6d24df ("hugetlb: perform vmemmap optimization on a list of pages") d67e32f26713 ("hugetlb: restructure pool allocations") d2cf88c27f51 ("hugetlb: optimize update_and_free_pages_bulk to avoid lock cycles") 30a89adf872d ("hugetlb: check for hugetlb folio before vmemmap_restore") d5b43e9683ec ("hugetlb: convert remove_pool_huge_page() to remove_pool_hugetlb_folio()") 04bbfd844b99 ("hugetlb: remove a few calls to page_folio()") fde1c4ecf916 ("mm: hugetlb: skip initialization of gigantic tail struct pages if freed by HVO") 3ee0aa9f0675 ("mm: move some shrinker-related function declarations to mm/internal.h") d8f5f7e445f0 ("hugetlb: set hugetlb page flag before optimizing vmemmap") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 5596d9e8b553dacb0ac34bcf873cbbfb16c3ba3e Mon Sep 17 00:00:00 2001 From: Miaohe Lin <linmiaohe(a)huawei.com> Date: Mon, 8 Jul 2024 10:51:27 +0800 Subject: [PATCH] mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio() There is a potential race between __update_and_free_hugetlb_folio() and try_memory_failure_hugetlb(): CPU1 CPU2 __update_and_free_hugetlb_folio try_memory_failure_hugetlb folio_test_hugetlb -- It's still hugetlb folio. folio_clear_hugetlb_hwpoison spin_lock_irq(&hugetlb_lock); __get_huge_page_for_hwpoison folio_set_hugetlb_hwpoison spin_unlock_irq(&hugetlb_lock); spin_lock_irq(&hugetlb_lock); __folio_clear_hugetlb(folio); -- Hugetlb flag is cleared but too late. spin_unlock_irq(&hugetlb_lock); When the above race occurs, raw error page info will be leaked. Even worse, raw error pages won't have hwpoisoned flag set and hit pcplists/buddy. Fix this issue by deferring folio_clear_hugetlb_hwpoison() until __folio_clear_hugetlb() is done. So all raw error pages will have hwpoisoned flag set. Link: https://lkml.kernel.org/r/20240708025127.107713-1-linmiaohe@huawei.com Fixes: 32c877191e02 ("hugetlb: do not clear hugetlb dtor until allocating vmemmap") Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com> Acked-by: Muchun Song <muchun.song(a)linux.dev> Reviewed-by: Oscar Salvador <osalvador(a)suse.de> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2afb70171b76..fe44324d6383 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1725,13 +1725,6 @@ static void __update_and_free_hugetlb_folio(struct hstate *h, return; } - /* - * Move PageHWPoison flag from head page to the raw error pages, - * which makes any healthy subpages reusable. - */ - if (unlikely(folio_test_hwpoison(folio))) - folio_clear_hugetlb_hwpoison(folio); - /* * If vmemmap pages were allocated above, then we need to clear the * hugetlb flag under the hugetlb lock. @@ -1742,6 +1735,13 @@ static void __update_and_free_hugetlb_folio(struct hstate *h, spin_unlock_irq(&hugetlb_lock); } + /* + * Move PageHWPoison flag from head page to the raw error pages, + * which makes any healthy subpages reusable. + */ + if (unlikely(folio_test_hwpoison(folio))) + folio_clear_hugetlb_hwpoison(folio); + folio_ref_unfreeze(folio, 1); /*

10 months, 3 weeks

3
2
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror August 2024