On 11/7/25 2:41 AM, Nilay Shroff wrote:
On 11/7/25 2:19 AM, Bart Van Assche wrote:
On 11/6/25 5:01 AM, Nilay Shroff wrote:
@@ -154,10 +153,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
-	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+	data_race(disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10));
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
I think we don't need data_race() here, as disk->bdi->ra_pages is already protected by ->limits_lock. Furthermore, while you're at it, I'd suggest also protecting the set/get accesses of ->ra_pages with ->limits_lock when they are invoked from the ioctl context (BLKRASET/BLKRAGET).
I think that we really need the data_race() annotation here because there is plenty of code that reads ra_pages without using any locking.
I believe that in that case we need to annotate both the reader and the writer with data_race(). Annotating only the writer but not the reader would not suppress KCSAN reports of a data race.
No. As the comment above the data_race() macro explains, the data_race() macro makes memory accesses invisible to KCSAN. Hence, annotating writers only with data_race() is sufficient.
@@ -480,9 +468,7 @@ static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 	if (err || val == 0)
 		return -EINVAL;
 
-	memflags = blk_mq_freeze_queue(q);
-	blk_queue_rq_timeout(q, msecs_to_jiffies(val));
-	blk_mq_unfreeze_queue(q, memflags);
+	data_race((blk_queue_rq_timeout(q, msecs_to_jiffies(val)), 0));
 
 	return count;
 }
The use of data_race() above seems redundant, since the update to q->rq_timeout is already marked with WRITE_ONCE(). However, the read accesses to q->rq_timeout in a few places in the I/O hot path are not marked; they are plain C-language loads.
That's an existing issue and an issue that falls outside the scope of my patch.
Hmm, I don't think this issue falls outside the scope of the current patch. Before this change, queue_io_timeout_store() could not run concurrently with the I/O hot path and update q->rq_timeout while it was being read, because the queue was frozen. Since this patch removes the queue freeze from queue_io_timeout_store(), it can now execute concurrently with the I/O hot path, which could manifest the KCSAN-reported data race described above.
Annotating all rq_timeout read accesses with READ_ONCE() would be cumbersome because there are plenty of direct rq_timeout accesses outside the block layer, e.g. in drivers/scsi/st.c (SCSI tape driver).
I prefer an alternative approach: annotate the q->rq_timeout update in blk_queue_rq_timeout() with both data_race() and WRITE_ONCE(). That guarantees that the rq_timeout update happens exactly once and prevents KCSAN from complaining about the plain rq_timeout reads.
Thanks,
Bart.