From: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se
[ Upstream commit d3a2328e741bf6e9e6bda750e0a63832fa365a74 ]
The TSC id and number of TSC ids should be stored as unsigned int as they can't be negative. Fix the datatype of the loop counter 'i' and rcar_gen3_thermal_tsc.id to reflect this.
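A tiny userspace sketch of the same idea (hypothetical struct and values, not the driver code): an index that can never be negative is declared unsigned int and printed with %u.

#include <stdio.h>

/* Hypothetical stand-in for rcar_gen3_thermal_tsc: the channel id can
 * never be negative, so it is stored as unsigned int. */
struct tsc_channel {
	unsigned int id; /* thermal channel id */
};

int main(void)
{
	struct tsc_channel channels[3] = { { 0 }, { 1 }, { 2 } };
	unsigned int i; /* loop counter over non-negative ids */

	for (i = 0; i < 3; i++)
		printf("TSC%u: id %u\n", i, channels[i].id);

	return 0;
}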
Signed-off-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20210804091818.2196806-3-niklas.soderlund+renesas@... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thermal/rcar_gen3_thermal.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c index fdf16aa34eb4..702696cf58b6 100644 --- a/drivers/thermal/rcar_gen3_thermal.c +++ b/drivers/thermal/rcar_gen3_thermal.c @@ -84,7 +84,7 @@ struct rcar_gen3_thermal_tsc { struct thermal_zone_device *zone; struct equation_coefs coef; int tj_t; - int id; /* thermal channel id */ + unsigned int id; /* thermal channel id */ };
struct rcar_gen3_thermal_priv { @@ -310,7 +310,8 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev) const int *ths_tj_1 = of_device_get_match_data(dev); struct resource *res; struct thermal_zone_device *zone; - int ret, i; + unsigned int i; + int ret;
/* default values if FUSEs are missing */ /* TODO: Read values from hardware on supported platforms */ @@ -376,7 +377,7 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev) if (ret < 0) goto error_unregister;
- dev_info(dev, "TSC%d: Loaded %d trip points\n", i, ret); + dev_info(dev, "TSC%u: Loaded %d trip points\n", i, ret); }
priv->num_tscs = i;
From: Tomer Tayar ttayar@habana.ai
[ Upstream commit 89aad770d692e4d2d9a604c1674e9dfa69421430 ]
In case of host-resident MMU, when the page tables pool is destroyed, its pointer is not nullified correctly. As a result, on a device fini which happens after a failing reset, the already destroyed pool is accessed, which leads to a kernel panic. The patch fixes the setting of the pool pointer to NULL.
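A hedged userspace sketch of the underlying pattern (hypothetical types, not the habanalabs code): a fini routine that destroys a pool must clear the pointer inside the same guarded block, so a second fini after a failed reset sees NULL and does nothing.

#include <stdio.h>
#include <stdlib.h>

struct mmu_priv {
	void *pgt_pool; /* stand-in for the gen_pool pointer */
};

static void mmu_fini(struct mmu_priv *priv)
{
	if (priv->pgt_pool) {
		free(priv->pgt_pool);
		/* Clear the pointer in the same guarded block so a repeated
		 * fini (e.g. after a failed hard reset) is a no-op instead
		 * of a double free / use-after-free. */
		priv->pgt_pool = NULL;
	}
}

int main(void)
{
	struct mmu_priv priv = { .pgt_pool = malloc(64) };

	mmu_fini(&priv);
	mmu_fini(&priv); /* safe: pointer was nullified on the first call */
	printf("pool pointer after fini: %p\n", priv.pgt_pool);
	return 0;
}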
Signed-off-by: Tomer Tayar ttayar@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/mmu/mmu_v1.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/misc/habanalabs/common/mmu/mmu_v1.c b/drivers/misc/habanalabs/common/mmu/mmu_v1.c index c5e93ff32586..0f536f79dd9c 100644 --- a/drivers/misc/habanalabs/common/mmu/mmu_v1.c +++ b/drivers/misc/habanalabs/common/mmu/mmu_v1.c @@ -470,13 +470,13 @@ static void hl_mmu_v1_fini(struct hl_device *hdev) if (!ZERO_OR_NULL_PTR(hdev->mmu_priv.hr.mmu_shadow_hop0)) { kvfree(hdev->mmu_priv.dr.mmu_shadow_hop0); gen_pool_destroy(hdev->mmu_priv.dr.mmu_pgt_pool); - }
- /* Make sure that if we arrive here again without init was called we - * won't cause kernel panic. This can happen for example if we fail - * during hard reset code at certain points - */ - hdev->mmu_priv.dr.mmu_shadow_hop0 = NULL; + /* Make sure that if we arrive here again without init was + * called we won't cause kernel panic. This can happen for + * example if we fail during hard reset code at certain points + */ + hdev->mmu_priv.dr.mmu_shadow_hop0 = NULL; + } }
/**
From: Koby Elbaz kelbaz@habana.ai
[ Upstream commit 8bb8b505761238be0d6a83dc41188867d65e5d4c ]
There is a scenario where an ongoing soft reset would race with an ongoing heartbeat routine, eventually causing the heartbeat to fail and thus escalate into a hard reset.
With this fix, the soft-reset procedure will disable heartbeat CPU messages and flush the current (ongoing) one before continuing with the reset code.
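The take_release_locks() helper added in the diff below relies on a simple "flush" idiom: acquiring and immediately releasing a mutex guarantees that whoever held it when the reset path tried to take it has left the critical section. A small pthread sketch of that idiom (hypothetical names, not the driver code, compile with -lpthread):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t send_cpu_message_lock = PTHREAD_MUTEX_INITIALIZER;

/* Simulates a context that is in the middle of sending a message. */
static void *sender(void *arg)
{
	pthread_mutex_lock(&send_cpu_message_lock);
	puts("sender: inside critical section");
	pthread_mutex_unlock(&send_cpu_message_lock);
	return NULL;
}

/* Flush idiom: lock + unlock without doing any work in between.
 * Returning from this pair proves that whoever held the lock when we
 * tried to acquire it has released it. */
static void flush_senders(void)
{
	pthread_mutex_lock(&send_cpu_message_lock);
	pthread_mutex_unlock(&send_cpu_message_lock);
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, sender, NULL);
	flush_senders();
	puts("reset path: any in-flight sender has drained");
	pthread_join(t, NULL);
	return 0;
}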
Signed-off-by: Koby Elbaz kelbaz@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/device.c | 53 +++++++++++++++----- drivers/misc/habanalabs/common/firmware_if.c | 18 +++++-- drivers/misc/habanalabs/common/habanalabs.h | 4 +- drivers/misc/habanalabs/common/hw_queue.c | 30 ++++------- 4 files changed, 67 insertions(+), 38 deletions(-)
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c index ff4cbde289c0..0a788a13f2c1 100644 --- a/drivers/misc/habanalabs/common/device.c +++ b/drivers/misc/habanalabs/common/device.c @@ -682,6 +682,44 @@ int hl_device_set_debug_mode(struct hl_device *hdev, bool enable) return rc; }
+static void take_release_locks(struct hl_device *hdev) +{ + /* Flush anyone that is inside the critical section of enqueue + * jobs to the H/W + */ + hdev->asic_funcs->hw_queues_lock(hdev); + hdev->asic_funcs->hw_queues_unlock(hdev); + + /* Flush processes that are sending message to CPU */ + mutex_lock(&hdev->send_cpu_message_lock); + mutex_unlock(&hdev->send_cpu_message_lock); + + /* Flush anyone that is inside device open */ + mutex_lock(&hdev->fpriv_list_lock); + mutex_unlock(&hdev->fpriv_list_lock); +} + +static void cleanup_resources(struct hl_device *hdev, bool hard_reset) +{ + if (hard_reset) + device_late_fini(hdev); + + /* + * Halt the engines and disable interrupts so we won't get any more + * completions from H/W and we won't have any accesses from the + * H/W to the host machine + */ + hdev->asic_funcs->halt_engines(hdev, hard_reset); + + /* Go over all the queues, release all CS and their jobs */ + hl_cs_rollback_all(hdev); + + /* Release all pending user interrupts, each pending user interrupt + * holds a reference to user context + */ + hl_release_pending_user_interrupts(hdev); +} + /* * hl_device_suspend - initiate device suspend * @@ -707,16 +745,7 @@ int hl_device_suspend(struct hl_device *hdev) /* This blocks all other stuff that is not blocked by in_reset */ hdev->disabled = true;
- /* - * Flush anyone that is inside the critical section of enqueue - * jobs to the H/W - */ - hdev->asic_funcs->hw_queues_lock(hdev); - hdev->asic_funcs->hw_queues_unlock(hdev); - - /* Flush processes that are sending message to CPU */ - mutex_lock(&hdev->send_cpu_message_lock); - mutex_unlock(&hdev->send_cpu_message_lock); + take_release_locks(hdev);
rc = hdev->asic_funcs->suspend(hdev); if (rc) @@ -894,8 +923,8 @@ int hl_device_reset(struct hl_device *hdev, u32 flags) return 0; }
- hard_reset = (flags & HL_RESET_HARD) != 0; - from_hard_reset_thread = (flags & HL_RESET_FROM_RESET_THREAD) != 0; + hard_reset = !!(flags & HL_RESET_HARD); + from_hard_reset_thread = !!(flags & HL_RESET_FROM_RESET_THREAD);
if (!hard_reset && !hdev->supports_soft_reset) { hard_instead_soft = true; diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c index 2e4d04ec6b53..653e8f5ef6ac 100644 --- a/drivers/misc/habanalabs/common/firmware_if.c +++ b/drivers/misc/habanalabs/common/firmware_if.c @@ -240,11 +240,15 @@ int hl_fw_send_cpu_message(struct hl_device *hdev, u32 hw_queue_id, u32 *msg, /* set fence to a non valid value */ pkt->fence = cpu_to_le32(UINT_MAX);
- rc = hl_hw_queue_send_cb_no_cmpl(hdev, hw_queue_id, len, pkt_dma_addr); - if (rc) { - dev_err(hdev->dev, "Failed to send CB on CPU PQ (%d)\n", rc); - goto out; - } + /* + * The CPU queue is a synchronous queue with an effective depth of + * a single entry (although it is allocated with room for multiple + * entries). We lock on it using 'send_cpu_message_lock' which + * serializes accesses to the CPU queue. + * Which means that we don't need to lock the access to the entire H/W + * queues module when submitting a JOB to the CPU queue. + */ + hl_hw_queue_submit_bd(hdev, queue, 0, len, pkt_dma_addr);
if (prop->fw_app_cpu_boot_dev_sts0 & CPU_BOOT_DEV_STS0_PKT_PI_ACK_EN) expected_ack_val = queue->pi; @@ -2235,6 +2239,10 @@ static int hl_fw_dynamic_init_cpu(struct hl_device *hdev, dev_info(hdev->dev, "Loading firmware to device, may take some time...\n");
+ /* + * In this stage, "cpu_dyn_regs" contains only LKD's hard coded values! + * It will be updated from FW after hl_fw_dynamic_request_descriptor(). + */ dyn_regs = &fw_loader->dynamic_loader.comm_desc.cpu_dyn_regs;
rc = hl_fw_dynamic_send_protocol_cmd(hdev, fw_loader, COMMS_RST_STATE, diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h index 6b3cdd7e068a..c63e26da5135 100644 --- a/drivers/misc/habanalabs/common/habanalabs.h +++ b/drivers/misc/habanalabs/common/habanalabs.h @@ -2436,7 +2436,9 @@ void destroy_hdev(struct hl_device *hdev); int hl_hw_queues_create(struct hl_device *hdev); void hl_hw_queues_destroy(struct hl_device *hdev); int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, - u32 cb_size, u64 cb_ptr); + u32 cb_size, u64 cb_ptr); +void hl_hw_queue_submit_bd(struct hl_device *hdev, struct hl_hw_queue *q, + u32 ctl, u32 len, u64 ptr); int hl_hw_queue_schedule_cs(struct hl_cs *cs); u32 hl_hw_queue_add_ptr(u32 ptr, u16 val); void hl_hw_queue_inc_ci_kernel(struct hl_device *hdev, u32 hw_queue_id); diff --git a/drivers/misc/habanalabs/common/hw_queue.c b/drivers/misc/habanalabs/common/hw_queue.c index bcabfdbf1e01..0afead229e97 100644 --- a/drivers/misc/habanalabs/common/hw_queue.c +++ b/drivers/misc/habanalabs/common/hw_queue.c @@ -65,7 +65,7 @@ void hl_hw_queue_update_ci(struct hl_cs *cs) }
/* - * ext_and_hw_queue_submit_bd() - Submit a buffer descriptor to an external or a + * hl_hw_queue_submit_bd() - Submit a buffer descriptor to an external or a * H/W queue. * @hdev: pointer to habanalabs device structure * @q: pointer to habanalabs queue structure @@ -80,8 +80,8 @@ void hl_hw_queue_update_ci(struct hl_cs *cs) * This function must be called when the scheduler mutex is taken * */ -static void ext_and_hw_queue_submit_bd(struct hl_device *hdev, - struct hl_hw_queue *q, u32 ctl, u32 len, u64 ptr) +void hl_hw_queue_submit_bd(struct hl_device *hdev, struct hl_hw_queue *q, + u32 ctl, u32 len, u64 ptr) { struct hl_bd *bd;
@@ -222,8 +222,8 @@ static int hw_queue_sanity_checks(struct hl_device *hdev, struct hl_hw_queue *q, * @cb_size: size of CB * @cb_ptr: pointer to CB location * - * This function sends a single CB, that must NOT generate a completion entry - * + * This function sends a single CB, that must NOT generate a completion entry. + * Sending CPU messages can be done instead via 'hl_hw_queue_submit_bd()' */ int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, u32 cb_size, u64 cb_ptr) @@ -231,16 +231,7 @@ int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, struct hl_hw_queue *q = &hdev->kernel_queues[hw_queue_id]; int rc = 0;
- /* - * The CPU queue is a synchronous queue with an effective depth of - * a single entry (although it is allocated with room for multiple - * entries). Therefore, there is a different lock, called - * send_cpu_message_lock, that serializes accesses to the CPU queue. - * As a result, we don't need to lock the access to the entire H/W - * queues module when submitting a JOB to the CPU queue - */ - if (q->queue_type != QUEUE_TYPE_CPU) - hdev->asic_funcs->hw_queues_lock(hdev); + hdev->asic_funcs->hw_queues_lock(hdev);
if (hdev->disabled) { rc = -EPERM; @@ -258,11 +249,10 @@ int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, goto out; }
- ext_and_hw_queue_submit_bd(hdev, q, 0, cb_size, cb_ptr); + hl_hw_queue_submit_bd(hdev, q, 0, cb_size, cb_ptr);
out: - if (q->queue_type != QUEUE_TYPE_CPU) - hdev->asic_funcs->hw_queues_unlock(hdev); + hdev->asic_funcs->hw_queues_unlock(hdev);
return rc; } @@ -328,7 +318,7 @@ static void ext_queue_schedule_job(struct hl_cs_job *job) cq->pi = hl_cq_inc_ptr(cq->pi);
submit_bd: - ext_and_hw_queue_submit_bd(hdev, q, ctl, len, ptr); + hl_hw_queue_submit_bd(hdev, q, ctl, len, ptr); }
/* @@ -407,7 +397,7 @@ static void hw_queue_schedule_job(struct hl_cs_job *job) else ptr = (u64) (uintptr_t) job->user_cb;
- ext_and_hw_queue_submit_bd(hdev, q, ctl, len, ptr); + hl_hw_queue_submit_bd(hdev, q, ctl, len, ptr); }
static int init_signal_cs(struct hl_device *hdev,
From: Luben Tuikov luben.tuikov@amd.com
[ Upstream commit a6a355a22f7a0efa6a11bc90b5161f394d51fe95 ]
1) Generalize the function--if the user didn't set i2c_address, still return true/false to indicate whether VBIOS contains the RAS EEPROM address. This function shouldn't evaluate whether the user set the i2c_address pointer or not.
2) Don't touch the caller's i2c_address, unless you have to--this function shouldn't have side effects.
3) Correctly set the function comment as a kernel-doc comment.
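A hedged sketch of the calling convention being described (hypothetical structures, not the amdgpu code): return true/false to say whether the value exists, and only touch the caller's pointer when it is non-NULL and there is a valid value to report.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the VBIOS firmware_info table. */
struct fw_info {
	uint8_t ras_rom_i2c_slave_addr;
};

static bool ras_rom_addr(const struct fw_info *info, uint8_t *i2c_address)
{
	/* Report availability regardless of whether the caller passed a
	 * pointer; only write through it when it is non-NULL and the
	 * value is actually present, so the call has no side effects
	 * otherwise. */
	if (info->ras_rom_i2c_slave_addr) {
		if (i2c_address)
			*i2c_address = info->ras_rom_i2c_slave_addr;
		return true;
	}
	return false;
}

int main(void)
{
	struct fw_info info = { .ras_rom_i2c_slave_addr = 0xA0 };
	uint8_t addr = 0;

	if (ras_rom_addr(&info, NULL))        /* query only, no side effect */
		puts("VBIOS reports a RAS EEPROM address");
	if (ras_rom_addr(&info, &addr))
		printf("address: 0x%02x\n", addr);
	return 0;
}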
Cc: John Clements john.clements@amd.com Cc: Hawking Zhang Hawking.Zhang@amd.com Cc: Alex Deucher Alexander.Deucher@amd.com Signed-off-by: Luben Tuikov luben.tuikov@amd.com Reviewed-by: Alex Deucher Alexander.Deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 50 ++++++++++++------- 1 file changed, 33 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c index 8f53837d4d3e..97178b307ed6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c @@ -468,14 +468,18 @@ bool amdgpu_atomfirmware_dynamic_boot_config_supported(struct amdgpu_device *ade return (fw_cap & ATOM_FIRMWARE_CAP_DYNAMIC_BOOT_CFG_ENABLE) ? true : false; }
-/* - * Helper function to query RAS EEPROM address - * - * @adev: amdgpu_device pointer +/** + * amdgpu_atomfirmware_ras_rom_addr -- Get the RAS EEPROM addr from VBIOS + * adev: amdgpu_device pointer + * i2c_address: pointer to u8; if not NULL, will contain + * the RAS EEPROM address if the function returns true * - * Return true if vbios supports ras rom address reporting + * Return true if VBIOS supports RAS EEPROM address reporting, + * else return false. If true and @i2c_address is not NULL, + * will contain the RAS ROM address. */ -bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, uint8_t* i2c_address) +bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, + u8 *i2c_address) { struct amdgpu_mode_info *mode_info = &adev->mode_info; int index; @@ -483,27 +487,39 @@ bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, uint8_t* i2c_a union firmware_info *firmware_info; u8 frev, crev;
- if (i2c_address == NULL) - return false; - - *i2c_address = 0; - index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1, - firmwareinfo); + firmwareinfo);
if (amdgpu_atom_parse_data_header(adev->mode_info.atom_context, - index, &size, &frev, &crev, &data_offset)) { + index, &size, &frev, &crev, + &data_offset)) { /* support firmware_info 3.4 + */ if ((frev == 3 && crev >=4) || (frev > 3)) { firmware_info = (union firmware_info *) (mode_info->atom_context->bios + data_offset); - *i2c_address = firmware_info->v34.ras_rom_i2c_slave_addr; + /* The ras_rom_i2c_slave_addr should ideally + * be a 19-bit EEPROM address, which would be + * used as is by the driver; see top of + * amdgpu_eeprom.c. + * + * When this is the case, 0 is of course a + * valid RAS EEPROM address, in which case, + * we'll drop the first "if (firm...)" and only + * leave the check for the pointer. + * + * The reason this works right now is because + * ras_rom_i2c_slave_addr contains the EEPROM + * device type qualifier 1010b in the top 4 + * bits. + */ + if (firmware_info->v34.ras_rom_i2c_slave_addr) { + if (i2c_address) + *i2c_address = firmware_info->v34.ras_rom_i2c_slave_addr; + return true; + } } }
- if (*i2c_address != 0) - return true; - return false; }
From: Anson Jacob Anson.Jacob@amd.com
[ Upstream commit 03388a347fe7cf7c3bdf68b0823ba316d177d470 ]
Free memory allocated if any of the previous allocations failed.
CID 1487129: Resource leaks (RESOURCE_LEAK) Variable "vpg" going out of scope leaks the storage it points to.
Addresses-Coverity-ID: 1487129: ("Resource leaks")
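A minimal userspace sketch of the fix pattern (hypothetical types): when any of a group of allocations fails, release them all before returning, relying on free(NULL) being a no-op just as kfree(NULL) is in the kernel.

#include <stdio.h>
#include <stdlib.h>

struct vpg  { int inst; };
struct afmt { int inst; };
struct enc  { struct vpg *vpg; struct afmt *afmt; };

static struct enc *encoder_create(void)
{
	struct enc  *enc  = calloc(1, sizeof(*enc));
	struct vpg  *vpg  = calloc(1, sizeof(*vpg));
	struct afmt *afmt = calloc(1, sizeof(*afmt));

	if (!enc || !vpg || !afmt) {
		/* free(NULL) is a no-op, so the parts that failed are
		 * harmless and the parts that succeeded are released
		 * instead of leaking. */
		free(enc);
		free(vpg);
		free(afmt);
		return NULL;
	}

	enc->vpg = vpg;
	enc->afmt = afmt;
	return enc;
}

static void encoder_destroy(struct enc *enc)
{
	if (!enc)
		return;
	free(enc->vpg);
	free(enc->afmt);
	free(enc);
}

int main(void)
{
	struct enc *enc = encoder_create();

	printf("encoder %s\n", enc ? "created" : "creation failed");
	encoder_destroy(enc);
	return 0;
}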
Reviewed-by: Aric Cyr aric.cyr@amd.com Acked-by: Mikita Lipski mikita.lipski@amd.com Signed-off-by: Anson Jacob Anson.Jacob@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c index dc7823d23ba8..dd38796ba30a 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c @@ -510,8 +510,12 @@ static struct stream_encoder *dcn303_stream_encoder_create(enum engine_id eng_id vpg = dcn303_vpg_create(ctx, vpg_inst); afmt = dcn303_afmt_create(ctx, afmt_inst);
- if (!enc1 || !vpg || !afmt) + if (!enc1 || !vpg || !afmt) { + kfree(enc1); + kfree(vpg); + kfree(afmt); return NULL; + }
dcn30_dio_stream_encoder_construct(enc1, ctx, ctx->dc_bios, eng_id, vpg, afmt, &stream_enc_regs[eng_id], &se_shift, &se_mask);
From: Philip Yang Philip.Yang@amd.com
[ Upstream commit d7eff46c214c036606dd3cd305bd5a128aecfe8c ]
Take a reference on the process VM root BO in case the process is exiting and the root BO is freed, to avoid the NULL pointer dereference shown in the backtrace below:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 Call Trace: amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu] seq_show+0x12c/0x180 seq_read+0x153/0x410 vfs_read+0x91/0x140[ 3427.206183] ksys_read+0x4f/0xb0 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca
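A hedged, single-threaded sketch of the reference-counting idea (hypothetical helpers, not the amdgpu API): take a reference before using an object that another path may be tearing down, and bail out if there is nothing left to reference.

#include <stdio.h>
#include <stdlib.h>

struct bo {
	int refcount;
};

/* Take a reference; returns NULL if there is no object to pin. */
static struct bo *bo_ref(struct bo *bo)
{
	if (!bo)
		return NULL;
	bo->refcount++;
	return bo;
}

/* Drop a reference; free on the last put. */
static void bo_unref(struct bo **bo)
{
	if (!*bo)
		return;
	if (--(*bo)->refcount == 0)
		free(*bo);
	*bo = NULL;
}

int main(void)
{
	struct bo *vm_root = calloc(1, sizeof(*vm_root));
	vm_root->refcount = 1;

	/* fdinfo-style reader: pin the root BO before touching it. */
	struct bo *root = bo_ref(vm_root);
	if (!root)
		return 0; /* process already tore the VM down */

	printf("using root BO, refcount=%d\n", root->refcount);
	bo_unref(&root);

	bo_unref(&vm_root); /* owner drops the last reference */
	return 0;
}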
Signed-off-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c index d94c5419ec25..5a6857c44bb6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c @@ -59,6 +59,7 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f) uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0; struct drm_file *file = f->private_data; struct amdgpu_device *adev = drm_to_adev(file->minor->dev); + struct amdgpu_bo *root; int ret;
ret = amdgpu_file_to_fpriv(f, &fpriv); @@ -69,13 +70,19 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f) dev = PCI_SLOT(adev->pdev->devfn); fn = PCI_FUNC(adev->pdev->devfn);
- ret = amdgpu_bo_reserve(fpriv->vm.root.bo, false); + root = amdgpu_bo_ref(fpriv->vm.root.bo); + if (!root) + return; + + ret = amdgpu_bo_reserve(root, false); if (ret) { DRM_ERROR("Fail to reserve bo\n"); return; } amdgpu_vm_get_memory(&fpriv->vm, &vram_mem, >t_mem, &cpu_mem); - amdgpu_bo_unreserve(fpriv->vm.root.bo); + amdgpu_bo_unreserve(root); + amdgpu_bo_unref(&root); + seq_printf(m, "pdev:\t%04x:%02x:%02x.%d\npasid:\t%u\n", domain, bus, dev, fn, fpriv->vm.pasid); seq_printf(m, "vram mem:\t%llu kB\n", vram_mem/1024UL);
From: Ofir Bitton obitton@habana.ai
[ Upstream commit a6c849012b0f51c674f52384bd9a4f3dc0a33c31 ]
Currently there is no validity check for the event ID received from the F/W, thus exposing the driver to a memory overrun.
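A minimal sketch of the missing check (hypothetical array size): validate an index that comes from outside before using it to index a statistics array.

#include <stdio.h>

#define EVENT_SIZE 8 /* hypothetical stand-in for GAUDI_EVENT_SIZE */

static unsigned int events_stat[EVENT_SIZE];

static void handle_event(unsigned int event_type)
{
	/* Reject anything out of range that the firmware reports,
	 * otherwise events_stat[event_type]++ overruns the array. */
	if (event_type >= EVENT_SIZE) {
		fprintf(stderr, "Event type %u exceeds maximum of %u\n",
			event_type, EVENT_SIZE - 1);
		return;
	}
	events_stat[event_type]++;
}

int main(void)
{
	handle_event(3);   /* valid */
	handle_event(100); /* rejected instead of corrupting memory */
	printf("events_stat[3] = %u\n", events_stat[3]);
	return 0;
}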
Signed-off-by: Ofir Bitton obitton@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/gaudi/gaudi.c | 6 ++++++ drivers/misc/habanalabs/goya/goya.c | 6 ++++++ 2 files changed, 12 insertions(+)
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c index aa8a0ca5aca2..409f05c962f2 100644 --- a/drivers/misc/habanalabs/gaudi/gaudi.c +++ b/drivers/misc/habanalabs/gaudi/gaudi.c @@ -7809,6 +7809,12 @@ static void gaudi_handle_eqe(struct hl_device *hdev, u8 cause; bool reset_required;
+ if (event_type >= GAUDI_EVENT_SIZE) { + dev_err(hdev->dev, "Event type %u exceeds maximum of %u", + event_type, GAUDI_EVENT_SIZE - 1); + return; + } + gaudi->events_stat[event_type]++; gaudi->events_stat_aggregate[event_type]++;
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c index 755e08cf2ecc..bfb22f96c1a3 100644 --- a/drivers/misc/habanalabs/goya/goya.c +++ b/drivers/misc/habanalabs/goya/goya.c @@ -4797,6 +4797,12 @@ void goya_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_entry) >> EQ_CTL_EVENT_TYPE_SHIFT); struct goya_device *goya = hdev->asic_specific;
+ if (event_type >= GOYA_ASYNC_EVENT_ID_SIZE) { + dev_err(hdev->dev, "Event type %u exceeds maximum of %u", + event_type, GOYA_ASYNC_EVENT_ID_SIZE - 1); + return; + } + goya->events_stat[event_type]++; goya->events_stat_aggregate[event_type]++;
From: Yuri Nudelman ynudelman@habana.ai
[ Upstream commit 09ae43043c748423a5dcdc7bb1e63e4dcabe9bd6 ]
The address resolution via debugfs was not taking into consideration the page offset, resulting in a wrong address.
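A hedged arithmetic sketch of the point (hypothetical 4 KiB pages, not the habanalabs MMU code): a page-table walk yields the physical page frame, and the low bits of the virtual address (the page offset) must be added back to get the full physical address.

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE  4096ULL
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical result of a page-table walk: physical frame of the page. */
static uint64_t walk_page_table(uint64_t virt_addr)
{
	(void)virt_addr;
	return 0x12345000ULL; /* frame-aligned value from the last PTE */
}

static uint64_t va_to_pa(uint64_t virt_addr)
{
	uint64_t frame = walk_page_table(virt_addr) & PAGE_MASK;

	/* Without OR-ing the page offset back in, the resolved address
	 * points at the start of the page, not at the requested byte. */
	return frame | (virt_addr & ~PAGE_MASK);
}

int main(void)
{
	uint64_t va = 0x7f0000000abcULL;

	printf("va 0x%llx -> pa 0x%llx\n",
	       (unsigned long long)va, (unsigned long long)va_to_pa(va));
	return 0;
}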
Signed-off-by: Yuri Nudelman ynudelman@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/debugfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/habanalabs/common/debugfs.c b/drivers/misc/habanalabs/common/debugfs.c index 703d79fb6f3f..379529bffc70 100644 --- a/drivers/misc/habanalabs/common/debugfs.c +++ b/drivers/misc/habanalabs/common/debugfs.c @@ -349,7 +349,7 @@ static int mmu_show(struct seq_file *s, void *data) return 0; }
- phys_addr = hops_info.hop_info[hops_info.used_hops - 1].hop_pte_val; + hl_mmu_va_to_pa(ctx, virt_addr, &phys_addr);
if (hops_info.scrambled_vaddr && (dev_entry->mmu_addr != hops_info.scrambled_vaddr))
From: Omer Shpigelman oshpigelman@habana.ai
[ Upstream commit 71731090ab17a208a58020e4b342fdfee280458a ]
On init, the disabled state is cleared right before hw_init, and that causes the device to report an "Operational" state before device initialization is finished. Although the char device is not yet exposed to the user at this stage, the sysfs entries are exposed.
This can cause errors in monitoring applications that use the sysfs entries.
In order to avoid this, a new state, "in device creation", is introduced, to be reported when the device is not disabled but is still in its init flow.
Signed-off-by: Omer Shpigelman oshpigelman@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/device.c | 3 +++ drivers/misc/habanalabs/common/habanalabs.h | 2 +- .../misc/habanalabs/common/habanalabs_drv.c | 8 ++++++-- drivers/misc/habanalabs/common/sysfs.c | 20 +++++++------------ include/uapi/misc/habanalabs.h | 4 +++- 5 files changed, 20 insertions(+), 17 deletions(-)
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c index 0a788a13f2c1..846a7e78582c 100644 --- a/drivers/misc/habanalabs/common/device.c +++ b/drivers/misc/habanalabs/common/device.c @@ -23,6 +23,8 @@ enum hl_device_status hl_device_status(struct hl_device *hdev) status = HL_DEVICE_STATUS_NEEDS_RESET; else if (hdev->disabled) status = HL_DEVICE_STATUS_MALFUNCTION; + else if (!hdev->init_done) + status = HL_DEVICE_STATUS_IN_DEVICE_CREATION; else status = HL_DEVICE_STATUS_OPERATIONAL;
@@ -44,6 +46,7 @@ bool hl_device_operational(struct hl_device *hdev, case HL_DEVICE_STATUS_NEEDS_RESET: return false; case HL_DEVICE_STATUS_OPERATIONAL: + case HL_DEVICE_STATUS_IN_DEVICE_CREATION: default: return true; } diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h index c63e26da5135..c48d130a9049 100644 --- a/drivers/misc/habanalabs/common/habanalabs.h +++ b/drivers/misc/habanalabs/common/habanalabs.h @@ -1798,7 +1798,7 @@ struct hl_dbg_device_entry {
#define HL_STR_MAX 32
-#define HL_DEV_STS_MAX (HL_DEVICE_STATUS_NEEDS_RESET + 1) +#define HL_DEV_STS_MAX (HL_DEVICE_STATUS_LAST + 1)
/* Theoretical limit only. A single host can only contain up to 4 or 8 PCIe * x16 cards. In extreme cases, there are hosts that can accommodate 16 cards. diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c index 4194cda2d04c..536451a9a16c 100644 --- a/drivers/misc/habanalabs/common/habanalabs_drv.c +++ b/drivers/misc/habanalabs/common/habanalabs_drv.c @@ -318,12 +318,16 @@ int create_hdev(struct hl_device **dev, struct pci_dev *pdev, hdev->asic_prop.fw_security_enabled = false;
/* Assign status description string */ - strncpy(hdev->status[HL_DEVICE_STATUS_MALFUNCTION], - "disabled", HL_STR_MAX); + strncpy(hdev->status[HL_DEVICE_STATUS_OPERATIONAL], + "operational", HL_STR_MAX); strncpy(hdev->status[HL_DEVICE_STATUS_IN_RESET], "in reset", HL_STR_MAX); + strncpy(hdev->status[HL_DEVICE_STATUS_MALFUNCTION], + "disabled", HL_STR_MAX); strncpy(hdev->status[HL_DEVICE_STATUS_NEEDS_RESET], "needs reset", HL_STR_MAX); + strncpy(hdev->status[HL_DEVICE_STATUS_IN_DEVICE_CREATION], + "in device creation", HL_STR_MAX);
hdev->major = hl_major; hdev->reset_on_lockup = reset_on_lockup; diff --git a/drivers/misc/habanalabs/common/sysfs.c b/drivers/misc/habanalabs/common/sysfs.c index db72df282ef8..34f9f2779962 100644 --- a/drivers/misc/habanalabs/common/sysfs.c +++ b/drivers/misc/habanalabs/common/sysfs.c @@ -9,8 +9,7 @@
#include <linux/pci.h>
-long hl_get_frequency(struct hl_device *hdev, u32 pll_index, - bool curr) +long hl_get_frequency(struct hl_device *hdev, u32 pll_index, bool curr) { struct cpucp_packet pkt; u32 used_pll_idx; @@ -44,8 +43,7 @@ long hl_get_frequency(struct hl_device *hdev, u32 pll_index, return (long) result; }
-void hl_set_frequency(struct hl_device *hdev, u32 pll_index, - u64 freq) +void hl_set_frequency(struct hl_device *hdev, u32 pll_index, u64 freq) { struct cpucp_packet pkt; u32 used_pll_idx; @@ -285,16 +283,12 @@ static ssize_t status_show(struct device *dev, struct device_attribute *attr, char *buf) { struct hl_device *hdev = dev_get_drvdata(dev); - char *str; + char str[HL_STR_MAX];
- if (atomic_read(&hdev->in_reset)) - str = "In reset"; - else if (hdev->disabled) - str = "Malfunction"; - else if (hdev->needs_reset) - str = "Needs Reset"; - else - str = "Operational"; + strscpy(str, hdev->status[hl_device_status(hdev)], HL_STR_MAX); + + /* use uppercase for backward compatibility */ + str[0] = 'A' + (str[0] - 'a');
return sprintf(buf, "%s\n", str); } diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h index a47a731e4527..b4b681b81df8 100644 --- a/include/uapi/misc/habanalabs.h +++ b/include/uapi/misc/habanalabs.h @@ -276,7 +276,9 @@ enum hl_device_status { HL_DEVICE_STATUS_OPERATIONAL, HL_DEVICE_STATUS_IN_RESET, HL_DEVICE_STATUS_MALFUNCTION, - HL_DEVICE_STATUS_NEEDS_RESET + HL_DEVICE_STATUS_NEEDS_RESET, + HL_DEVICE_STATUS_IN_DEVICE_CREATION, + HL_DEVICE_STATUS_LAST = HL_DEVICE_STATUS_IN_DEVICE_CREATION };
/* Opcode for management ioctl
From: farah kassabri fkassabri@habana.ai
[ Upstream commit 607b1468c2263e082d74c1a3e71399a9026b41ce ]
Fix two places in the code where it was possible to go to sleep while holding a spinlock.
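A userspace analogue of the general rule behind the fix (hypothetical names, compile with -lpthread): do any potentially blocking work outside the spinlock and keep only non-blocking work under it. The kernel patch achieves the same by switching the allocation that must happen under the spinlock to GFP_ATOMIC.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_spinlock_t idr_lock;
static void *table[16];

static int insert_entry(int handle)
{
	/* Potentially blocking: done before the spinlock is taken. */
	void *entry = malloc(32);

	if (!entry)
		return -1;

	pthread_spin_lock(&idr_lock);
	table[handle] = entry;          /* non-blocking work only */
	pthread_spin_unlock(&idr_lock);
	return 0;
}

int main(void)
{
	pthread_spin_init(&idr_lock, PTHREAD_PROCESS_PRIVATE);
	if (insert_entry(3) == 0)
		puts("entry inserted without sleeping under the spinlock");
	free(table[3]);
	pthread_spin_destroy(&idr_lock);
	return 0;
}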
Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: farah kassabri fkassabri@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/command_buffer.c | 2 -- drivers/misc/habanalabs/common/memory.c | 2 +- 2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/misc/habanalabs/common/command_buffer.c b/drivers/misc/habanalabs/common/command_buffer.c index 719168c980a4..402ac2395fc8 100644 --- a/drivers/misc/habanalabs/common/command_buffer.c +++ b/drivers/misc/habanalabs/common/command_buffer.c @@ -314,8 +314,6 @@ int hl_cb_create(struct hl_device *hdev, struct hl_cb_mgr *mgr,
spin_lock(&mgr->cb_lock); rc = idr_alloc(&mgr->cb_handles, cb, 1, 0, GFP_ATOMIC); - if (rc < 0) - rc = idr_alloc(&mgr->cb_handles, cb, 1, 0, GFP_KERNEL); spin_unlock(&mgr->cb_lock);
if (rc < 0) { diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c index af339ce1ab4f..fcadde594a58 100644 --- a/drivers/misc/habanalabs/common/memory.c +++ b/drivers/misc/habanalabs/common/memory.c @@ -124,7 +124,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
spin_lock(&vm->idr_lock); handle = idr_alloc(&vm->phys_pg_pack_handles, phys_pg_pack, 1, 0, - GFP_KERNEL); + GFP_ATOMIC); spin_unlock(&vm->idr_lock);
if (handle < 0) {
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit c68eb29c8e9067c08175dd0414f6984f236f719d ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-img.c | 16 ---------------- 1 file changed, 16 deletions(-)
diff --git a/drivers/pwm/pwm-img.c b/drivers/pwm/pwm-img.c index 11b16ecc4f96..18d8e34d0d08 100644 --- a/drivers/pwm/pwm-img.c +++ b/drivers/pwm/pwm-img.c @@ -326,23 +326,7 @@ static int img_pwm_probe(struct platform_device *pdev) static int img_pwm_remove(struct platform_device *pdev) { struct img_pwm_chip *pwm_chip = platform_get_drvdata(pdev); - u32 val; - unsigned int i; - int ret; - - ret = pm_runtime_get_sync(&pdev->dev); - if (ret < 0) { - pm_runtime_put(&pdev->dev); - return ret; - } - - for (i = 0; i < pwm_chip->chip.npwm; i++) { - val = img_pwm_readl(pwm_chip, PWM_CTRL_CFG); - val &= ~BIT(i); - img_pwm_writel(pwm_chip, PWM_CTRL_CFG, val); - }
- pm_runtime_put(&pdev->dev); pm_runtime_disable(&pdev->dev); if (!pm_runtime_status_suspended(&pdev->dev)) img_pwm_runtime_suspend(&pdev->dev);
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 9d768cd7fd42bb0be16f36aec48548fca5260759 ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-rockchip.c | 14 -------------- 1 file changed, 14 deletions(-)
diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c index cbe900877724..8fcef29948d7 100644 --- a/drivers/pwm/pwm-rockchip.c +++ b/drivers/pwm/pwm-rockchip.c @@ -384,20 +384,6 @@ static int rockchip_pwm_remove(struct platform_device *pdev) { struct rockchip_pwm_chip *pc = platform_get_drvdata(pdev);
- /* - * Disable the PWM clk before unpreparing it if the PWM device is still - * running. This should only happen when the last PWM user left it - * enabled, or when nobody requested a PWM that was previously enabled - * by the bootloader. - * - * FIXME: Maybe the core should disable all PWM devices in - * pwmchip_remove(). In this case we'd only have to call - * clk_unprepare() after pwmchip_remove(). - * - */ - if (pwm_is_enabled(pc->chip.pwms)) - clk_disable(pc->clk); - clk_unprepare(pc->pclk); clk_unprepare(pc->clk);
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit d44084c93427bb0a9261432db1a8ca76a42d805e ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-stm32-lp.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/pwm/pwm-stm32-lp.c b/drivers/pwm/pwm-stm32-lp.c index 93dd03618465..e4a10aac354d 100644 --- a/drivers/pwm/pwm-stm32-lp.c +++ b/drivers/pwm/pwm-stm32-lp.c @@ -222,8 +222,6 @@ static int stm32_pwm_lp_remove(struct platform_device *pdev) { struct stm32_pwm_lp *priv = platform_get_drvdata(pdev);
- pwm_disable(&priv->chip.pwms[0]); - return pwmchip_remove(&priv->chip); }
From: Hannes Reinecke hare@suse.de
[ Upstream commit f04064814c2a15c22ed9c803f9b634ef34f91092 ]
The serial number is copied into the buffer via memcpy_and_pad() with the length NVMET_SN_MAX_SIZE. So when printing it out we also need to use just that length, as anything beyond it will be uninitialized.
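A small sketch of bounding string output with printf for a fixed-size field that may not be NUL terminated (hypothetical buffer, not the nvmet code). Note that in printf terms the precision ("%.*s") limits how many bytes are read, while the field width ("%*s") only pads.

#include <stdio.h>
#include <string.h>

#define SN_MAX_SIZE 20 /* hypothetical stand-in for NVMET_SN_MAX_SIZE */

int main(void)
{
	/* Fixed-size serial field, space padded and not NUL terminated,
	 * the way memcpy_and_pad() would leave it. */
	char serial[SN_MAX_SIZE];

	memset(serial, ' ', sizeof(serial));
	memcpy(serial, "SN123456", 8);

	/* The precision bounds how many bytes printf may read from
	 * serial, so it never walks past the 20-byte field. */
	printf("serial: \"%.*s\"\n", SN_MAX_SIZE, serial);
	return 0;
}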
Signed-off-by: Hannes Reinecke hare@suse.de Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/configfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c index 273555127188..fa88bf9cba4d 100644 --- a/drivers/nvme/target/configfs.c +++ b/drivers/nvme/target/configfs.c @@ -1067,7 +1067,8 @@ static ssize_t nvmet_subsys_attr_serial_show(struct config_item *item, { struct nvmet_subsys *subsys = to_subsys(item);
- return snprintf(page, PAGE_SIZE, "%s\n", subsys->serial); + return snprintf(page, PAGE_SIZE, "%*s\n", + NVMET_SN_MAX_SIZE, subsys->serial); }
static ssize_t
From: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp
[ Upstream commit dfbb3409b27fa42b96f5727a80d3ceb6a8663991 ]
If CONFIG_BLK_DEV_LOOP && CONFIG_MTD (at least; there might be other combinations), lockdep complains about a circular locking dependency at __loop_clr_fd(), because major_names_lock serves as a hub that aggregates locking dependencies across multiple block modules.
====================================================== WARNING: possible circular locking dependency detected 5.14.0+ #757 Tainted: G E ------------------------------------------------------ systemd-udevd/7568 is trying to acquire lock: ffff88800f334d48 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x70/0x560
but task is already holding lock: ffff888014a7d4a0 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x4d/0x400 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #6 (&lo->lo_mutex){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_killable_nested+0x17/0x20 lo_open+0x23/0x50 [loop] blkdev_get_by_dev+0x199/0x540 blkdev_open+0x58/0x90 do_dentry_open+0x144/0x3a0 path_openat+0xa57/0xda0 do_filp_open+0x9f/0x140 do_sys_openat2+0x71/0x150 __x64_sys_openat+0x78/0xa0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #5 (&disk->open_mutex){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_nested+0x17/0x20 bd_register_pending_holders+0x20/0x100 device_add_disk+0x1ae/0x390 loop_add+0x29c/0x2d0 [loop] blk_request_module+0x5a/0xb0 blkdev_get_no_open+0x27/0xa0 blkdev_get_by_dev+0x5f/0x540 blkdev_open+0x58/0x90 do_dentry_open+0x144/0x3a0 path_openat+0xa57/0xda0 do_filp_open+0x9f/0x140 do_sys_openat2+0x71/0x150 __x64_sys_openat+0x78/0xa0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #4 (major_names_lock){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_nested+0x17/0x20 blkdev_show+0x19/0x80 devinfo_show+0x52/0x60 seq_read_iter+0x2d5/0x3e0 proc_reg_read_iter+0x41/0x80 vfs_read+0x2ac/0x330 ksys_read+0x6b/0xd0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&p->lock){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_nested+0x17/0x20 seq_read_iter+0x37/0x3e0 generic_file_splice_read+0xf3/0x170 splice_direct_to_actor+0x14e/0x350 do_splice_direct+0x84/0xd0 do_sendfile+0x263/0x430 __se_sys_sendfile64+0x96/0xc0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#3){.+.+}-{0:0}: lock_acquire+0xbe/0x1f0 lo_write_bvec+0x96/0x280 [loop] loop_process_work+0xa68/0xc10 [loop] process_one_work+0x293/0x480 worker_thread+0x23d/0x4b0 kthread+0x163/0x180 ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: lock_acquire+0xbe/0x1f0 process_one_work+0x280/0x480 worker_thread+0x23d/0x4b0 kthread+0x163/0x180 ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}: validate_chain+0x1f0d/0x33e0 __lock_acquire+0x92d/0x1030 lock_acquire+0xbe/0x1f0 flush_workqueue+0x8c/0x560 drain_workqueue+0x80/0x140 destroy_workqueue+0x47/0x4f0 __loop_clr_fd+0xb4/0x400 [loop] blkdev_put+0x14a/0x1d0 blkdev_close+0x1c/0x20 __fput+0xfd/0x220 task_work_run+0x69/0xc0 exit_to_user_mode_prepare+0x1ce/0x1f0 syscall_exit_to_user_mode+0x26/0x60 do_syscall_64+0x4c/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0);
*** DEADLOCK ***
2 locks held by systemd-udevd/7568: #0: ffff888012554128 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_put+0x4c/0x1d0 #1: ffff888014a7d4a0 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x4d/0x400 [loop]
stack backtrace: CPU: 0 PID: 7568 Comm: systemd-udevd Tainted: G E 5.14.0+ #757 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020 Call Trace: dump_stack_lvl+0x79/0xbf print_circular_bug+0x5d6/0x5e0 ? stack_trace_save+0x42/0x60 ? save_trace+0x3d/0x2d0 check_noncircular+0x10b/0x120 validate_chain+0x1f0d/0x33e0 ? __lock_acquire+0x953/0x1030 ? __lock_acquire+0x953/0x1030 __lock_acquire+0x92d/0x1030 ? flush_workqueue+0x70/0x560 lock_acquire+0xbe/0x1f0 ? flush_workqueue+0x70/0x560 flush_workqueue+0x8c/0x560 ? flush_workqueue+0x70/0x560 ? sched_clock_cpu+0xe/0x1a0 ? drain_workqueue+0x41/0x140 drain_workqueue+0x80/0x140 destroy_workqueue+0x47/0x4f0 ? blk_mq_freeze_queue_wait+0xac/0xd0 __loop_clr_fd+0xb4/0x400 [loop] ? __mutex_unlock_slowpath+0x35/0x230 blkdev_put+0x14a/0x1d0 blkdev_close+0x1c/0x20 __fput+0xfd/0x220 task_work_run+0x69/0xc0 exit_to_user_mode_prepare+0x1ce/0x1f0 syscall_exit_to_user_mode+0x26/0x60 do_syscall_64+0x4c/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f0fd4c661f7 Code: 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 13 fc ff ff RSP: 002b:00007ffd1c9e9fd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 00007f0fd46be6c8 RCX: 00007f0fd4c661f7 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006 RBP: 0000000000000006 R08: 000055fff1eaf400 R09: 0000000000000000 R10: 00007f0fd46be6c8 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000002f08 R15: 00007ffd1c9ea050
Commit 1c500ad706383f1a ("loop: reduce the loop_ctl_mutex scope") is for breaking "loop_ctl_mutex => &lo->lo_mutex" dependency chain. But enabling a different block module results in forming circular locking dependency due to shared major_names_lock mutex.
The simplest fix is to call the probe function without holding major_names_lock [1], but Christoph Hellwig does not like that idea. Therefore, instead of holding major_names_lock in blkdev_show(), introduce a different lock for blkdev_show() in order to break the "sb_writers#$N => &p->lock => major_names_lock" dependency chain.
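A hedged pthread sketch of the approach (hypothetical list, not the genhd code, compile with -lpthread): registration keeps the existing mutex, while the read-only show path takes only a new, narrower spinlock and therefore no longer participates in the mutex's dependency chain. The spinlock is also taken around list updates so readers always see a consistent list.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t    names_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_spinlock_t names_spinlock;

static char names[8][16];
static int  nr_names;

/* Registration path: keeps the mutex, and additionally takes the
 * spinlock around the actual list update. */
static void register_name(const char *name)
{
	pthread_mutex_lock(&names_lock);
	pthread_spin_lock(&names_spinlock);
	snprintf(names[nr_names++], sizeof(names[0]), "%s", name);
	pthread_spin_unlock(&names_spinlock);
	pthread_mutex_unlock(&names_lock);
}

/* Show path: only the narrow spinlock, so it cannot take part in the
 * lock dependency chain built around the mutex. */
static void show_names(void)
{
	pthread_spin_lock(&names_spinlock);
	for (int i = 0; i < nr_names; i++)
		printf("%3d %s\n", i, names[i]);
	pthread_spin_unlock(&names_spinlock);
}

int main(void)
{
	pthread_spin_init(&names_spinlock, PTHREAD_PROCESS_PRIVATE);
	register_name("loop");
	register_name("sd");
	show_names();
	pthread_spin_destroy(&names_spinlock);
	return 0;
}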
Link: https://lkml.kernel.org/r/b2af8a5b-3c1b-204e-7f56-bea0b15848d6@i-love.sakura... [1] Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Link: https://lore.kernel.org/r/18a02da2-0bf3-550e-b071-2b4ab13c49f0@i-love.sakura... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/genhd.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/block/genhd.c b/block/genhd.c index 298ee78c1bda..9aba65404416 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -164,6 +164,7 @@ static struct blk_major_name { void (*probe)(dev_t devt); } *major_names[BLKDEV_MAJOR_HASH_SIZE]; static DEFINE_MUTEX(major_names_lock); +static DEFINE_SPINLOCK(major_names_spinlock);
/* index in the above - for now: assume no multimajor ranges */ static inline int major_to_index(unsigned major) @@ -176,11 +177,11 @@ void blkdev_show(struct seq_file *seqf, off_t offset) { struct blk_major_name *dp;
- mutex_lock(&major_names_lock); + spin_lock(&major_names_spinlock); for (dp = major_names[major_to_index(offset)]; dp; dp = dp->next) if (dp->major == offset) seq_printf(seqf, "%3d %s\n", dp->major, dp->name); - mutex_unlock(&major_names_lock); + spin_unlock(&major_names_spinlock); } #endif /* CONFIG_PROC_FS */
@@ -252,6 +253,7 @@ int __register_blkdev(unsigned int major, const char *name, p->next = NULL; index = major_to_index(major);
+ spin_lock(&major_names_spinlock); for (n = &major_names[index]; *n; n = &(*n)->next) { if ((*n)->major == major) break; @@ -260,6 +262,7 @@ int __register_blkdev(unsigned int major, const char *name, *n = p; else ret = -EBUSY; + spin_unlock(&major_names_spinlock);
if (ret < 0) { printk("register_blkdev: cannot get major %u for %s\n", @@ -279,6 +282,7 @@ void unregister_blkdev(unsigned int major, const char *name) int index = major_to_index(major);
mutex_lock(&major_names_lock); + spin_lock(&major_names_spinlock); for (n = &major_names[index]; *n; n = &(*n)->next) if ((*n)->major == major) break; @@ -288,6 +292,7 @@ void unregister_blkdev(unsigned int major, const char *name) p = *n; *n = p->next; } + spin_unlock(&major_names_spinlock); mutex_unlock(&major_names_lock); kfree(p); }
From: Li Jinlin lijinlin3@huawei.com
[ Upstream commit 884f0e84f1e3195b801319c8ec3d5774e9bf2710 ]
The pending timer has been set up in blk_throtl_init(). However, the timer is not deleted in blk_throtl_exit(). This means that the timer handler may still be running after freeing the timer, which would result in a use-after-free.
Fix by calling del_timer_sync() to delete the timer in blk_throtl_exit().
Signed-off-by: Li Jinlin lijinlin3@huawei.com Link: https://lore.kernel.org/r/20210907121242.2885564-1-lijinlin3@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-throttle.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c index b1b22d863bdf..d0cc77c7b8bd 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -2426,6 +2426,7 @@ int blk_throtl_init(struct request_queue *q) void blk_throtl_exit(struct request_queue *q) { BUG_ON(!q->td); + del_timer_sync(&q->td->service_queue.pending_timer); throtl_shutdown_wq(q); blkcg_deactivate_policy(q, &blkcg_policy_throtl); free_percpu(q->td->latency_buckets[READ]);
From: Song Liu songliubraving@fb.com
[ Upstream commit 7f2a6a69f7ced6db8220298e0497cf60482a9d4b ]
Limiting the number of requests to BLK_MAX_REQUEST_COUNT at blk_plug hurts performance for large md arrays. [1] shows the resync speed of an md array dropping for arrays with more than 16 HDDs.
Fix this by allowing more requests in the plug queue. The multiple_queues flag is used to apply the higher limit only to the multiple-queue case.
[1] https://lore.kernel.org/linux-raid/CAFDAVznS71BXW8Jxv6k9dXc2iR3ysX3iZRBww_rz... Tested-by: Marcin Wanat marcin.wanat@gmail.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index 9d4fdc2be88a..9c64f0025a56 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2135,6 +2135,18 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq) } }
+/* + * Allow 4x BLK_MAX_REQUEST_COUNT requests on plug queue for multiple + * queues. This is important for md arrays to benefit from merging + * requests. + */ +static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug) +{ + if (plug->multiple_queues) + return BLK_MAX_REQUEST_COUNT * 4; + return BLK_MAX_REQUEST_COUNT; +} + /** * blk_mq_submit_bio - Create and send a request to block device. * @bio: Bio pointer. @@ -2231,7 +2243,7 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio) else last = list_entry_rq(plug->mq_list.prev);
- if (request_count >= BLK_MAX_REQUEST_COUNT || (last && + if (request_count >= blk_plug_max_rq_count(plug) || (last && blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) { blk_flush_plug_list(plug, false); trace_block_plug(q);
From: Yu-Tung Chang mtwget@gmail.com
[ Upstream commit 0c45d3e24ef3d3d87c5e0077b8f38d1372af7176 ]
The rtc-rx8010 driver uses the I2C regmap but doesn't select it in Kconfig, so the build may fail depending on the configuration. Fix it.
Signed-off-by: Yu-Tung Chang mtwget@gmail.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Link: https://lore.kernel.org/r/20210830052532.40356-1-mtwget@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig index 12153d5801ce..f7bf87097a9f 100644 --- a/drivers/rtc/Kconfig +++ b/drivers/rtc/Kconfig @@ -624,6 +624,7 @@ config RTC_DRV_FM3130
config RTC_DRV_RX8010 tristate "Epson RX8010SJ" + select REGMAP_I2C help If you say yes here you get support for the Epson RX8010SJ RTC chip.
From: Sebastian Andrzej Siewior bigeasy@linutronix.de
[ Upstream commit 9848417926353daa59d2b05eb26e185063dbac6e ]
The intel powerclamp driver will set up a per-CPU worker with RT priority. The worker will then invoke play_idle() in which it remains in the idle poll loop until it is stopped by the timer it started earlier.
That timer needs to expire in hard interrupt context on PREEMPT_RT. Otherwise the timer will expire in ksoftirqd as a SOFT timer but that task won't be scheduled on the CPU because its priority is lower than the priority of the worker which is in the idle loop.
Always expire the idle timer in hard interrupt context.
Reported-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20210906113034.jgfxrjdvxnjqgtmc@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/idle.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index 912b47aa99d8..d17b0a5ce6ac 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -379,10 +379,10 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns) cpuidle_use_deepest_state(latency_ns);
it.done = 0; - hrtimer_init_on_stack(&it.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hrtimer_init_on_stack(&it.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); it.timer.function = idle_inject_timer_fn; hrtimer_start(&it.timer, ns_to_ktime(duration_ns), - HRTIMER_MODE_REL_PINNED); + HRTIMER_MODE_REL_PINNED_HARD);
while (!READ_ONCE(it.done)) do_idle();
From: Enzo Matsumiya ematsumiya@suse.de
[ Upstream commit 9351590f51cdda49d0265932a37f099950998504 ]
Cached root file was not being completely invalidated sometimes.
Reproducing:
- With a DFS share with 2 targets, one disabled and one enabled
- start some I/O on the mount
  # while true; do ls /mnt/dfs; done
- at the same time, disable the enabled target and enable the disabled one
- wait for DFS cache to expire
- on reconnect, the previous cached root handle should be invalid, but open_cached_dir_by_dentry() will still try to use it, but throws a use-after-free warning (kref_get())
Make smb2_close_cached_fid() invalidate all fields every time, but only send an SMB2_close() when the entry is still valid.
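A small sketch of the cleanup shape described above (hypothetical struct, not the cifs code): the on-the-wire close remains conditional on the handle being valid, but the invalidation of the cached fields happens unconditionally every time the release function runs.

#include <stdbool.h>
#include <stdio.h>

struct cached_fid {
	bool  is_valid;
	bool  has_lease;
	void *dentry;
};

static void close_remote_handle(void)
{
	puts("SMB2_close sent"); /* stand-in for the on-the-wire close */
}

static void close_cached_fid(struct cached_fid *cfid)
{
	/* Only talk to the server when the handle is still valid ... */
	if (cfid->is_valid)
		close_remote_handle();

	/* ... but always invalidate the cached state, so a later lookup
	 * can never pick up a stale handle after a reconnect. */
	cfid->is_valid = false;
	cfid->has_lease = false;
	cfid->dentry = NULL;
}

int main(void)
{
	struct cached_fid cfid = { .is_valid = true, .has_lease = true };

	close_cached_fid(&cfid); /* sends the close and invalidates */
	close_cached_fid(&cfid); /* invalidates only, nothing sent */
	printf("is_valid=%d has_lease=%d\n", cfid.is_valid, cfid.has_lease);
	return 0;
}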
Signed-off-by: Enzo Matsumiya ematsumiya@suse.de Reviewed-by: Paulo Alcantara (SUSE) pc@cjr.nz Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2ops.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index 2dfd0d8297eb..1b9de38a136a 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -689,13 +689,19 @@ smb2_close_cached_fid(struct kref *ref) cifs_dbg(FYI, "clear cached root file handle\n"); SMB2_close(0, cfid->tcon, cfid->fid->persistent_fid, cfid->fid->volatile_fid); - cfid->is_valid = false; - cfid->file_all_info_is_valid = false; - cfid->has_lease = false; - if (cfid->dentry) { - dput(cfid->dentry); - cfid->dentry = NULL; - } + } + + /* + * We only check validity above to send SMB2_close, + * but we still need to invalidate these entries + * when this function is called + */ + cfid->is_valid = false; + cfid->file_all_info_is_valid = false; + cfid->has_lease = false; + if (cfid->dentry) { + dput(cfid->dentry); + cfid->dentry = NULL; } }
From: Hao Xu haoxu@linux.alibaba.com
[ Upstream commit 32c2d33e0b7c4ea53284d5d9435dd022b582c8cf ]
The build check of __REQ_F_LAST_BIT should use "larger than", not "equal or larger than". It's perfectly valid to have __REQ_F_LAST_BIT be 32, as that means the last valid bit is 31, which does fit in the type.
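A compile-time sketch of the off-by-one (hypothetical flag names, C11): if the last defined bit is 31, the flags still fit in a 32-bit int, so the check must reject only values strictly larger than 8 * sizeof(int).

#include <assert.h>
#include <stdio.h>

/* Hypothetical flag bits; the last one is bit 31. */
enum {
	FLAG_BIT_0 = 0,
	FLAG_BIT_31 = 31,
	FLAG_LAST_BIT = 32 /* one past the last valid bit */
};

/* Bits of an int are numbered 0..31, so FLAG_LAST_BIT == 32 is fine and
 * only values above 32 are errors. A stricter ">= 8 * sizeof(int)"
 * style check would wrongly reject this layout. */
static_assert(FLAG_LAST_BIT <= 8 * sizeof(int), "flags do not fit in an int");

int main(void)
{
	printf("last valid bit: %d\n", FLAG_LAST_BIT - 1);
	return 0;
}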
Signed-off-by: Hao Xu haoxu@linux.alibaba.com Link: https://lore.kernel.org/r/20210907032243.114190-1-haoxu@linux.alibaba.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index a2e20a6fbfed..305c9923283a 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -10306,7 +10306,7 @@ static int __init io_uring_init(void) BUILD_BUG_ON(SQE_VALID_FLAGS >= (1 << 8));
BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST); - BUILD_BUG_ON(__REQ_F_LAST_BIT >= 8 * sizeof(int)); + BUILD_BUG_ON(__REQ_F_LAST_BIT > 8 * sizeof(int));
req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
linux-stable-mirror@lists.linaro.org