From: Boris Brezillon boris.brezillon@collabora.com
[ Upstream commit e30cb0599799aac099209e3b045379613c80730e ]
drm_sched_entity_kill_jobs_cb() logic is omitting the last fence popped from the dependency array that was waited upon before drm_sched_entity_kill() was called (drm_sched_entity::dependency field), so we're basically waiting for all dependencies except one.
In theory, this wait shouldn't be needed because resources should have their users registered to the dma_resv object, thus guaranteeing that future jobs wanting to access these resources wait on all the previous users (depending on the access type, of course). But we want to keep these explicit waits in the kill entity path just in case.
Let's make sure we keep all dependencies in the array in drm_sched_job_dependency(), so we can iterate over the array and wait in drm_sched_entity_kill_jobs_cb().
We also make sure we wait on drm_sched_fence::finished if we were originally asked to wait on drm_sched_fence::scheduled. In that case, we assume the intent was to delegate the wait to the firmware/GPU or rely on the pipelining done at the entity/scheduler level, but when killing jobs, we really want to wait for completion not just scheduling.
v2: - Don't evict deps in drm_sched_job_dependency()
v3: - Always wait for drm_sched_fence::finished fences in drm_sched_entity_kill_jobs_cb() when we see a sched_fence
v4: - Fix commit message - Fix a use-after-free bug
v5: - Flag deps on which we should only wait for the scheduled event at insertion time
v6: - Back to v4 implementation - Add Christian's R-b
Cc: Frank Binns frank.binns@imgtec.com Cc: Sarah Walker sarah.walker@imgtec.com Cc: Donald Robson donald.robson@imgtec.com Cc: Luben Tuikov luben.tuikov@amd.com Cc: David Airlie airlied@gmail.com Cc: Daniel Vetter daniel@ffwll.ch Cc: Sumit Semwal sumit.semwal@linaro.org Cc: "Christian König" christian.koenig@amd.com Signed-off-by: Boris Brezillon boris.brezillon@collabora.com Suggested-by: "Christian König" christian.koenig@amd.com Reviewed-by: "Christian König" christian.koenig@amd.com Acked-by: Luben Tuikov luben.tuikov@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20230619071921.3465992-1-boris... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/scheduler/sched_entity.c | 41 +++++++++++++++++++----- 1 file changed, 33 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index e0a8890a62e23..42021d1f7e016 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -155,16 +155,32 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, { struct drm_sched_job *job = container_of(cb, struct drm_sched_job, finish_cb); - int r; + unsigned long index;
dma_fence_put(f);
/* Wait for all dependencies to avoid data corruptions */ - while (!xa_empty(&job->dependencies)) { - f = xa_erase(&job->dependencies, job->last_dependency++); - r = dma_fence_add_callback(f, &job->finish_cb, - drm_sched_entity_kill_jobs_cb); - if (!r) + xa_for_each(&job->dependencies, index, f) { + struct drm_sched_fence *s_fence = to_drm_sched_fence(f); + + if (s_fence && f == &s_fence->scheduled) { + /* The dependencies array had a reference on the scheduled + * fence, and the finished fence refcount might have + * dropped to zero. Use dma_fence_get_rcu() so we get + * a NULL fence in that case. + */ + f = dma_fence_get_rcu(&s_fence->finished); + + /* Now that we have a reference on the finished fence, + * we can release the reference the dependencies array + * had on the scheduled fence. + */ + dma_fence_put(&s_fence->scheduled); + } + + xa_erase(&job->dependencies, index); + if (f && !dma_fence_add_callback(f, &job->finish_cb, + drm_sched_entity_kill_jobs_cb)) return;
dma_fence_put(f); @@ -394,8 +410,17 @@ static struct dma_fence * drm_sched_job_dependency(struct drm_sched_job *job, struct drm_sched_entity *entity) { - if (!xa_empty(&job->dependencies)) - return xa_erase(&job->dependencies, job->last_dependency++); + struct dma_fence *f; + + /* We keep the fence around, so we can iterate over all dependencies + * in drm_sched_entity_kill_jobs_cb() to ensure all deps are signaled + * before killing the job. + */ + f = xa_load(&job->dependencies, job->last_dependency); + if (f) { + job->last_dependency++; + return dma_fence_get(f); + }
if (job->sched->ops->prepare_job) return job->sched->ops->prepare_job(job, entity);
From: Winston Wen wentao@uniontech.com
[ Upstream commit ff7d80a9f2711bf3d9fe1cfb70b3fd15c50584b7 ]
We switch session state to SES_EXITING without cifs_tcp_ses_lock now, it may lead to potential use-after-free issue.
Consider the following execution processes:
Thread 1: __cifs_put_smb_ses() spin_lock(&cifs_tcp_ses_lock) if (--ses->ses_count > 0) spin_unlock(&cifs_tcp_ses_lock) return spin_unlock(&cifs_tcp_ses_lock) ---> **GAP** spin_lock(&ses->ses_lock) if (ses->ses_status == SES_GOOD) ses->ses_status = SES_EXITING spin_unlock(&ses->ses_lock)
Thread 2: cifs_find_smb_ses() spin_lock(&cifs_tcp_ses_lock) list_for_each_entry(ses, ...) spin_lock(&ses->ses_lock) if (ses->ses_status == SES_EXITING) spin_unlock(&ses->ses_lock) continue ... spin_unlock(&ses->ses_lock) if (ret) cifs_smb_ses_inc_refcount(ret) spin_unlock(&cifs_tcp_ses_lock)
If thread 1 is preempted in the gap and thread 2 start executing, thread 2 will get the session, and soon thread 1 will switch the session state to SES_EXITING and start releasing it, even though thread 1 had increased the session's refcount and still uses it.
So switch session state under cifs_tcp_ses_lock to eliminate this gap.
Signed-off-by: Winston Wen wentao@uniontech.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/connect.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c index 9d16626e7a669..165ecb222c19b 100644 --- a/fs/smb/client/connect.c +++ b/fs/smb/client/connect.c @@ -1963,15 +1963,16 @@ void __cifs_put_smb_ses(struct cifs_ses *ses) spin_unlock(&cifs_tcp_ses_lock); return; } + spin_lock(&ses->ses_lock); + if (ses->ses_status == SES_GOOD) + ses->ses_status = SES_EXITING; + spin_unlock(&ses->ses_lock); spin_unlock(&cifs_tcp_ses_lock);
/* ses_count can never go negative */ WARN_ON(ses->ses_count < 0);
spin_lock(&ses->ses_lock); - if (ses->ses_status == SES_GOOD) - ses->ses_status = SES_EXITING; - if (ses->ses_status == SES_EXITING && server->ops->logoff) { spin_unlock(&ses->ses_lock); cifs_free_ipc(ses);
From: Tuo Li islituo@gmail.com
[ Upstream commit 0e881c0a4b6146b7e856735226208f48251facd8 ]
The variable phba->fcf.fcf_flag is often protected by the lock phba->hbalock() when is accessed. Here is an example in lpfc_unregister_fcf_rescan():
spin_lock_irq(&phba->hbalock); phba->fcf.fcf_flag |= FCF_INIT_DISC; spin_unlock_irq(&phba->hbalock);
However, in the same function, phba->fcf.fcf_flag is assigned with 0 without holding the lock, and thus can cause a data race:
phba->fcf.fcf_flag = 0;
To fix this possible data race, a lock and unlock pair is added when accessing the variable phba->fcf.fcf_flag.
Reported-by: BassCheck bass@buaa.edu.cn Signed-off-by: Tuo Li islituo@gmail.com Link: https://lore.kernel.org/r/20230630024748.1035993-1-islituo@gmail.com Reviewed-by: Justin Tee justin.tee@broadcom.com Reviewed-by: Laurence Oberman loberman@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_hbadisc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c index 5ba3a9ad95016..9d2feb69cae77 100644 --- a/drivers/scsi/lpfc/lpfc_hbadisc.c +++ b/drivers/scsi/lpfc/lpfc_hbadisc.c @@ -6961,7 +6961,9 @@ lpfc_unregister_fcf_rescan(struct lpfc_hba *phba) if (rc) return; /* Reset HBA FCF states after successful unregister FCF */ + spin_lock_irq(&phba->hbalock); phba->fcf.fcf_flag = 0; + spin_unlock_irq(&phba->hbalock); phba->fcf.current_rec.flag = 0;
/*
From: Damien Le Moal dlemoal@kernel.org
[ Upstream commit 03e51c4a74b91b0b1a9ca091029b0b58f014be81 ]
blk_revalidate_disk_zones() implements checks of the zones of a zoned block device, verifying that the zone size is a power of 2 number of sectors, that all zones (except possibly the last one) have the same size and that zones cover the entire addressing space of the device.
While these checks are appropriate to verify that well tested hardware devices have an adequate zone configurations, they lack in certain areas which may result in issues with emulated devices implemented with user drivers such as ublk or tcmu. Specifically, this function does not check if the device driver indicated support for the mandatory zone append writes, that is, if the device max_zone_append_sectors queue limit is set to a non-zero value. Additionally, invalid zones such as a zero length zone with a start sector equal to the device capacity will not be detected and result in out of bounds use of the zone bitmaps prepared with the callback function blk_revalidate_zone_cb().
Improve blk_revalidate_disk_zones() to address these inadequate checks, relying on the fact that all device drivers supporting zoned block devices must set the device zone size (chunk_sectors queue limit) and the max_zone_append_sectors queue limit before executing this function.
The check for a non-zero max_zone_append_sectors value is done in blk_revalidate_disk_zones() before executing the zone report. The zone report callback function blk_revalidate_zone_cb() is also modified to add a check that a zone start is below the device capacity.
The check that the zone size is a power of 2 number of sectors is moved to blk_revalidate_disk_zones() as the zone size is already known. Similarly, the number of zones of the device can be calculated in blk_revalidate_disk_zones() before executing the zone report.
The kdoc comment for blk_revalidate_disk_zones() is also updated to mention that device drivers must set the device zone size and the max_zone_append_sectors queue limit before calling this function.
Signed-off-by: Damien Le Moal dlemoal@kernel.org Link: https://lore.kernel.org/r/20230703024812.76778-6-dlemoal@kernel.org Reviewed-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-zoned.c | 86 +++++++++++++++++++++++++++-------------------- 1 file changed, 50 insertions(+), 36 deletions(-)
diff --git a/block/blk-zoned.c b/block/blk-zoned.c index fce9082384d65..da92ce0c5da98 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -448,7 +448,6 @@ struct blk_revalidate_zone_args { unsigned long *conv_zones_bitmap; unsigned long *seq_zones_wlock; unsigned int nr_zones; - sector_t zone_sectors; sector_t sector; };
@@ -462,38 +461,34 @@ static int blk_revalidate_zone_cb(struct blk_zone *zone, unsigned int idx, struct gendisk *disk = args->disk; struct request_queue *q = disk->queue; sector_t capacity = get_capacity(disk); + sector_t zone_sectors = q->limits.chunk_sectors; + + /* Check for bad zones and holes in the zone report */ + if (zone->start != args->sector) { + pr_warn("%s: Zone gap at sectors %llu..%llu\n", + disk->disk_name, args->sector, zone->start); + return -ENODEV; + } + + if (zone->start >= capacity || !zone->len) { + pr_warn("%s: Invalid zone start %llu, length %llu\n", + disk->disk_name, zone->start, zone->len); + return -ENODEV; + }
/* * All zones must have the same size, with the exception on an eventual * smaller last zone. */ - if (zone->start == 0) { - if (zone->len == 0 || !is_power_of_2(zone->len)) { - pr_warn("%s: Invalid zoned device with non power of two zone size (%llu)\n", - disk->disk_name, zone->len); - return -ENODEV; - } - - args->zone_sectors = zone->len; - args->nr_zones = (capacity + zone->len - 1) >> ilog2(zone->len); - } else if (zone->start + args->zone_sectors < capacity) { - if (zone->len != args->zone_sectors) { + if (zone->start + zone->len < capacity) { + if (zone->len != zone_sectors) { pr_warn("%s: Invalid zoned device with non constant zone size\n", disk->disk_name); return -ENODEV; } - } else { - if (zone->len > args->zone_sectors) { - pr_warn("%s: Invalid zoned device with larger last zone size\n", - disk->disk_name); - return -ENODEV; - } - } - - /* Check for holes in the zone report */ - if (zone->start != args->sector) { - pr_warn("%s: Zone gap at sectors %llu..%llu\n", - disk->disk_name, args->sector, zone->start); + } else if (zone->len > zone_sectors) { + pr_warn("%s: Invalid zoned device with larger last zone size\n", + disk->disk_name); return -ENODEV; }
@@ -532,11 +527,13 @@ static int blk_revalidate_zone_cb(struct blk_zone *zone, unsigned int idx, * @disk: Target disk * @update_driver_data: Callback to update driver data on the frozen disk * - * Helper function for low-level device drivers to (re) allocate and initialize - * a disk request queue zone bitmaps. This functions should normally be called - * within the disk ->revalidate method for blk-mq based drivers. For BIO based - * drivers only q->nr_zones needs to be updated so that the sysfs exposed value - * is correct. + * Helper function for low-level device drivers to check and (re) allocate and + * initialize a disk request queue zone bitmaps. This functions should normally + * be called within the disk ->revalidate method for blk-mq based drivers. + * Before calling this function, the device driver must already have set the + * device zone size (chunk_sector limit) and the max zone append limit. + * For BIO based drivers, this function cannot be used. BIO based device drivers + * only need to set disk->nr_zones so that the sysfs exposed value is correct. * If the @update_driver_data callback function is not NULL, the callback is * executed with the device request queue frozen after all zones have been * checked. @@ -545,9 +542,9 @@ int blk_revalidate_disk_zones(struct gendisk *disk, void (*update_driver_data)(struct gendisk *disk)) { struct request_queue *q = disk->queue; - struct blk_revalidate_zone_args args = { - .disk = disk, - }; + sector_t zone_sectors = q->limits.chunk_sectors; + sector_t capacity = get_capacity(disk); + struct blk_revalidate_zone_args args = { }; unsigned int noio_flag; int ret;
@@ -556,13 +553,31 @@ int blk_revalidate_disk_zones(struct gendisk *disk, if (WARN_ON_ONCE(!queue_is_mq(q))) return -EIO;
- if (!get_capacity(disk)) - return -EIO; + if (!capacity) + return -ENODEV; + + /* + * Checks that the device driver indicated a valid zone size and that + * the max zone append limit is set. + */ + if (!zone_sectors || !is_power_of_2(zone_sectors)) { + pr_warn("%s: Invalid non power of two zone size (%llu)\n", + disk->disk_name, zone_sectors); + return -ENODEV; + } + + if (!q->limits.max_zone_append_sectors) { + pr_warn("%s: Invalid 0 maximum zone append limit\n", + disk->disk_name); + return -ENODEV; + }
/* * Ensure that all memory allocations in this context are done as if * GFP_NOIO was specified. */ + args.disk = disk; + args.nr_zones = (capacity + zone_sectors - 1) >> ilog2(zone_sectors); noio_flag = memalloc_noio_save(); ret = disk->fops->report_zones(disk, 0, UINT_MAX, blk_revalidate_zone_cb, &args); @@ -576,7 +591,7 @@ int blk_revalidate_disk_zones(struct gendisk *disk, * If zones where reported, make sure that the entire disk capacity * has been checked. */ - if (ret > 0 && args.sector != get_capacity(disk)) { + if (ret > 0 && args.sector != capacity) { pr_warn("%s: Missing zones from sector %llu\n", disk->disk_name, args.sector); ret = -ENODEV; @@ -589,7 +604,6 @@ int blk_revalidate_disk_zones(struct gendisk *disk, */ blk_mq_freeze_queue(q); if (ret > 0) { - blk_queue_chunk_sectors(q, args.zone_sectors); disk->nr_zones = args.nr_zones; swap(disk->seq_zones_wlock, args.seq_zones_wlock); swap(disk->conv_zones_bitmap, args.conv_zones_bitmap);
From: ruanjinjie ruanjinjie@huawei.com
[ Upstream commit 956578e3d397e00d6254dc7b5194d28587f98518 ]
As ntb_register_device() don't handle error of device_register(), if ntb_register_device() returns error in pci_vntb_probe(), name of kobject which is allocated in dev_set_name() called in device_add() is leaked.
As comment of device_add() says, it should call put_device() to drop the reference count that was set in device_initialize() when it fails, so the name can be freed in kobject_cleanup().
Signed-off-by: ruanjinjie ruanjinjie@huawei.com Signed-off-by: Jon Mason jdmason@kudzu.us Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/endpoint/functions/pci-epf-vntb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c index b7c7a8af99f4f..77306983ac456 100644 --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c @@ -1285,6 +1285,7 @@ static int pci_vntb_probe(struct pci_dev *pdev, const struct pci_device_id *id) return 0;
err_register_dev: + put_device(&ndev->ntb.dev); return -EINVAL; }
From: Stuart Hayhurst stuart.a.hayhurst@gmail.com
[ Upstream commit a343a7682acc56182d4b54777c358f5ec6d274e7 ]
Previously, support for the G502 had been attempted in commit '27fc32fd9417 ("HID: logitech-hidpp: add USB PID for a few more supported mice")'
This caused some issues and was reverted by 'addf3382c47c ("Revert "HID: logitech-hidpp: add USB PID for a few more supported mice"")'.
Since then, a new version of this mouse has been released (Lightpseed Wireless), and works correctly.
This device has support for battery reporting with the driver
Signed-off-by: Stuart Hayhurst stuart.a.hayhurst@gmail.com Reviewed-by: Bastien Nocera hadess@hadess.net Link: https://lore.kernel.org/r/20230630113818.13005-1-stuart.a.hayhurst@gmail.com Signed-off-by: Benjamin Tissoires bentiss@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-logitech-hidpp.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index f7e06d433a915..22d5ba88954fc 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -4598,6 +4598,8 @@ static const struct hid_device_id hidpp_devices[] = {
{ /* Logitech G403 Wireless Gaming Mouse over USB */ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC082) }, + { /* Logitech G502 Lightspeed Wireless Gaming Mouse over USB */ + HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC08D) }, { /* Logitech G703 Gaming Mouse over USB */ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC087) }, { /* Logitech G703 Hero Gaming Mouse over USB */
From: Pankaj Raghav p.raghav@samsung.com
[ Upstream commit e5bb0988a5b622f58cc53dbdc044562229284d23 ]
Add the quirk as SM953 is reporting bogus namespace ID.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217593 Reported-by: Clemens Springsguth cspringsguth@gmail.com Tested-by: Clemens Springsguth cspringsguth@gmail.com Signed-off-by: Pankaj Raghav p.raghav@samsung.com Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 492f319ebdf37..8cef5805bfb2f 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3394,6 +3394,8 @@ static const struct pci_device_id nvme_id_table[] = { .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, }, { PCI_DEVICE(0x144d, 0xa809), /* Samsung MZALQ256HBJD 256G */ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, }, + { PCI_DEVICE(0x144d, 0xa802), /* Samsung SM953 */ + .driver_data = NVME_QUIRK_BOGUS_NID, }, { PCI_DEVICE(0x1cc4, 0x6303), /* UMIS RPJTJ512MGE1QDY 512G */ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, }, { PCI_DEVICE(0x1cc4, 0x6302), /* UMIS RPJTJ256MGE1QDY 256G */
linux-stable-mirror@lists.linaro.org