Freezing the request queue from inside sysfs store callbacks may cause a deadlock in combination with the dm-multipath driver and the queue_if_no_path option. Additionally, freezing the request queue slows down system boot on systems where sysfs attributes are set synchronously.
Fix this by removing the blk_mq_freeze_queue() / blk_mq_unfreeze_queue() calls from the store callbacks that do not strictly need them. A side effect of this change is that new settings may take effect with a small delay.
This patch affects the following sysfs attributes:
* io_poll_delay
* io_timeout
* nomerges
* read_ahead_kb
* rq_affinity
Here is an example of a deadlock triggered by running test srp/002:
task:multipathd
Call Trace:
 <TASK>
 __schedule+0x8c1/0x1bf0
 schedule+0xdd/0x270
 schedule_preempt_disabled+0x1c/0x30
 __mutex_lock+0xb89/0x1650
 mutex_lock_nested+0x1f/0x30
 dm_table_set_restrictions+0x823/0xdf0
 __bind+0x166/0x590
 dm_swap_table+0x2a7/0x490
 do_resume+0x1b1/0x610
 dev_suspend+0x55/0x1a0
 ctl_ioctl+0x3a5/0x7e0
 dm_ctl_ioctl+0x12/0x20
 __x64_sys_ioctl+0x127/0x1a0
 x64_sys_call+0xe2b/0x17d0
 do_syscall_64+0x96/0x3a0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
 </TASK>

task:(udev-worker)
Call Trace:
 <TASK>
 __schedule+0x8c1/0x1bf0
 schedule+0xdd/0x270
 blk_mq_freeze_queue_wait+0xf2/0x140
 blk_mq_freeze_queue_nomemsave+0x23/0x30
 queue_ra_store+0x14e/0x290
 queue_attr_store+0x23e/0x2c0
 sysfs_kf_write+0xde/0x140
 kernfs_fop_write_iter+0x3b2/0x630
 vfs_write+0x4fd/0x1390
 ksys_write+0xfd/0x230
 __x64_sys_write+0x76/0xc0
 x64_sys_call+0x276/0x17d0
 do_syscall_64+0x96/0x3a0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
 </TASK>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Nilay Shroff <nilay@linux.ibm.com>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Benjamin Marzinski <bmarzins@redhat.com>
Cc: stable@vger.kernel.org
Fixes: af2814149883 ("block: freeze the queue in queue_attr_store")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
Changes compared to v3:
- Added two data_race() annotations.

Changes compared to v2:
- Dropped the controversial patch "block: Restrict the duration of sysfs attribute changes".

Changes compared to v1:
- Added patch "block: Restrict the duration of sysfs attribute changes".
- Remove queue freezing from more sysfs callbacks.
 block/blk-sysfs.c | 20 +++-----------------
 1 file changed, 3 insertions(+), 17 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 76c47fe9b8d6..6eaccd18d8b4 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -143,7 +143,6 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 {
 	unsigned long ra_kb;
 	ssize_t ret;
-	unsigned int memflags;
 	struct request_queue *q = disk->queue;
 
 	ret = queue_var_store(&ra_kb, page, count);
@@ -154,10 +153,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
-	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+	data_race(disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10));
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
@@ -375,21 +372,18 @@ static ssize_t queue_nomerges_store(struct gendisk *disk, const char *page,
 				    size_t count)
 {
 	unsigned long nm;
-	unsigned int memflags;
 	struct request_queue *q = disk->queue;
 	ssize_t ret = queue_var_store(&nm, page, count);
 
 	if (ret < 0)
 		return ret;
 
-	memflags = blk_mq_freeze_queue(q);
 	blk_queue_flag_clear(QUEUE_FLAG_NOMERGES, q);
 	blk_queue_flag_clear(QUEUE_FLAG_NOXMERGES, q);
 	if (nm == 2)
 		blk_queue_flag_set(QUEUE_FLAG_NOMERGES, q);
 	else if (nm)
 		blk_queue_flag_set(QUEUE_FLAG_NOXMERGES, q);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
@@ -409,7 +403,6 @@ queue_rq_affinity_store(struct gendisk *disk, const char *page, size_t count)
 #ifdef CONFIG_SMP
 	struct request_queue *q = disk->queue;
 	unsigned long val;
-	unsigned int memflags;
 
 	ret = queue_var_store(&val, page, count);
 	if (ret < 0)
@@ -421,7 +414,6 @@ queue_rq_affinity_store(struct gendisk *disk, const char *page, size_t count)
 	 * are accessed individually using atomic test_bit operation. So we
 	 * don't grab any lock while updating these flags.
 	 */
-	memflags = blk_mq_freeze_queue(q);
 	if (val == 2) {
 		blk_queue_flag_set(QUEUE_FLAG_SAME_COMP, q);
 		blk_queue_flag_set(QUEUE_FLAG_SAME_FORCE, q);
@@ -432,7 +424,6 @@ queue_rq_affinity_store(struct gendisk *disk, const char *page, size_t count)
 		blk_queue_flag_clear(QUEUE_FLAG_SAME_COMP, q);
 		blk_queue_flag_clear(QUEUE_FLAG_SAME_FORCE, q);
 	}
-	blk_mq_unfreeze_queue(q, memflags);
 #endif
 	return ret;
 }
@@ -446,11 +437,9 @@ static ssize_t queue_poll_delay_store(struct gendisk *disk, const char *page,
 static ssize_t queue_poll_store(struct gendisk *disk, const char *page,
 				size_t count)
 {
-	unsigned int memflags;
 	ssize_t ret = count;
 	struct request_queue *q = disk->queue;
 
-	memflags = blk_mq_freeze_queue(q);
 	if (!(q->limits.features & BLK_FEAT_POLL)) {
 		ret = -EINVAL;
 		goto out;
@@ -459,7 +448,6 @@ static ssize_t queue_poll_store(struct gendisk *disk, const char *page,
 	pr_info_ratelimited("writes to the poll attribute are ignored.\n");
 	pr_info_ratelimited("please use driver specific parameters instead.\n");
 out:
-	blk_mq_unfreeze_queue(q, memflags);
 	return ret;
 }
 
@@ -472,7 +460,7 @@ static ssize_t queue_io_timeout_show(struct gendisk *disk, char *page)
 static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 				      size_t count)
 {
-	unsigned int val, memflags;
+	unsigned int val;
 	int err;
 	struct request_queue *q = disk->queue;
 
@@ -480,9 +468,7 @@ static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 	if (err || val == 0)
 		return -EINVAL;
 
-	memflags = blk_mq_freeze_queue(q);
-	blk_queue_rq_timeout(q, msecs_to_jiffies(val));
-	blk_mq_unfreeze_queue(q, memflags);
+	data_race((blk_queue_rq_timeout(q, msecs_to_jiffies(val)), 0));
 
 	return count;
 }
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 76c47fe9b8d6..6eaccd18d8b4 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -143,7 +143,6 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 {
 	unsigned long ra_kb;
 	ssize_t ret;
-	unsigned int memflags;
 	struct request_queue *q = disk->queue;
 
 	ret = queue_var_store(&ra_kb, page, count);
@@ -154,10 +153,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
-	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+	data_race(disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10));
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
I think we don't need data_race() here as disk->bdi->ra_pages is already protected by ->limits_lock. Furthermore, while you're at it, I’d suggest protecting the set/get access of ->ra_pages using ->limits_lock when it’s invoked from the ioctl context (BLKRASET/BLKRAGET).
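Concretely, the BLKRASET handler could take the same lock as queue_ra_store(). A rough, untested sketch (from memory; the exact code in block/ioctl.c may differ):

	case BLKRASET:
		if (!capable(CAP_SYS_ADMIN))
			return -EACCES;
		/* serialize against queue_ra_store() */
		mutex_lock(&bdev_get_queue(bdev)->limits_lock);
		bdev->bd_disk->bdi->ra_pages = (arg * 512) / PAGE_SIZE;
		mutex_unlock(&bdev_get_queue(bdev)->limits_lock);
		return 0;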
@@ -472,7 +460,7 @@ static ssize_t queue_io_timeout_show(struct gendisk *disk, char *page)
 static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 				      size_t count)
 {
-	unsigned int val, memflags;
+	unsigned int val;
 	int err;
 	struct request_queue *q = disk->queue;
 
@@ -480,9 +468,7 @@ static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 	if (err || val == 0)
 		return -EINVAL;
 
-	memflags = blk_mq_freeze_queue(q);
-	blk_queue_rq_timeout(q, msecs_to_jiffies(val));
-	blk_mq_unfreeze_queue(q, memflags);
+	data_race((blk_queue_rq_timeout(q, msecs_to_jiffies(val)), 0));
 
 	return count;
 }
The use of data_race() above seems redundant, since the update to q->rq_timeout is already marked with WRITE_ONCE(). However, the read access to q->rq_timeout in a few places within the I/O hotpath is not marked and instead accessed directly using plain C-language loads.
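For reference, the writer side is, as far as I can see, already just a marked store (blk_queue_rq_timeout() in block/blk-settings.c; cited from memory, so the exact code may differ):

	void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
	{
		/* The store is marked, so the compiler cannot tear it. */
		WRITE_ONCE(q->rq_timeout, timeout);
	}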
When I ran this proposed change with KCSAN enabled on my system, I observed the following warning (when one thread was running I/O and another thread was concurrently updating q->rq_timeout):
BUG: KCSAN: data-race in blk_add_timer+0x74/0x1f0
race at unknown origin, with read to 0xc000000143c9c568 of 4 bytes by task 3725 on cpu 16:
 blk_add_timer+0x74/0x1f0
 blk_mq_start_request+0xe0/0x558
 nvme_prep_rq.part.0+0x300/0x194c [nvme]
 nvme_queue_rqs+0x1f0/0x4e4 [nvme]
 blk_mq_dispatch_queue_requests+0x510/0x8e0
 blk_mq_flush_plug_list+0xc0/0x3d8
 __blk_flush_plug+0x29c/0x368
 __submit_bio+0x320/0x3bc
 submit_bio_noacct_nocheck+0x604/0x94c
 submit_bio_noacct+0x3b4/0xef8
 submit_bio+0x7c/0x310
 submit_bio_wait+0xd4/0x19c
 __blkdev_direct_IO_simple+0x290/0x484
 blkdev_direct_IO+0x658/0x1000
 blkdev_write_iter+0x530/0x67c
 vfs_write+0x698/0x894
 sys_pwrite64+0x140/0x17c
 system_call_exception+0x1ac/0x520
 system_call_vectored_common+0x15c/0x2ec
value changed: 0x00000bb8 -> 0x000005dc
Reported by Kernel Concurrency Sanitizer on:
CPU: 16 UID: 0 PID: 3725 Comm: fio Kdump: loaded Not tainted 6.18.0-rc3+ #47 VOLUNTARY
Hardware name: IBM,9105-22A Power11 (architected) 0x820200 0xf000007 of:IBM,FW1120.00 (RB1120_115) hv:phyp pSeries
==================================================================
Based on the gdb trace:
(gdb) info line *(blk_add_timer+0x74)
Line 138 of "block/blk-timeout.c" starts at address 0xc000000000d5637c <blk_add_timer+108>
and ends at 0xc000000000d5638c <blk_add_timer+124>.
This corresponds to:
128 void blk_add_timer(struct request *req)
129 {
130 	struct request_queue *q = req->q;
131 	unsigned long expiry;
132
133 	/*
134 	 * Some LLDs, like scsi, peek at the timeout to prevent a
135 	 * command from being retried forever.
136 	 */
137 	if (!req->timeout)
138 		req->timeout = q->rq_timeout;
As seen above, the read access to q->rq_timeout is unmarked. To avoid the reported data race, we should replace this plain access with READ_ONCE(q->rq_timeout). This is one instance in the I/O hotpath where q->rq_timeout is read without annotation, and there are likely a few more similar cases that should be updated in the same way.
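In other words, something like this (untested sketch of the change in blk_add_timer()):

	if (!req->timeout)
		/* marked read: tell KCSAN this racy load is intentional */
		req->timeout = READ_ONCE(q->rq_timeout);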
Thanks,
--Nilay
On 11/6/25 5:01 AM, Nilay Shroff wrote:
@@ -154,10 +153,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
-	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+	data_race(disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10));
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
I think we don't need data_race() here as disk->bdi->ra_pages is already protected by ->limits_lock. Furthermore, while you're at it, I’d suggest protecting the set/get access of ->ra_pages using ->limits_lock when it’s invoked from the ioctl context (BLKRASET/BLKRAGET).
I think that we really need the data_race() annotation here because there is plenty of code that reads ra_pages without using any locking.
@@ -480,9 +468,7 @@ static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 	if (err || val == 0)
 		return -EINVAL;
 
-	memflags = blk_mq_freeze_queue(q);
-	blk_queue_rq_timeout(q, msecs_to_jiffies(val));
-	blk_mq_unfreeze_queue(q, memflags);
+	data_race((blk_queue_rq_timeout(q, msecs_to_jiffies(val)), 0));
 
 	return count;
 }
The use of data_race() above seems redundant, since the update to q->rq_timeout is already marked with WRITE_ONCE(). However, the read access to q->rq_timeout in a few places within the I/O hotpath is not marked and instead accessed directly using plain C-language loads.
That's fair. I will remove data_race() again from queue_io_timeout_store().
BUG: KCSAN: data-race in blk_add_timer+0x74/0x1f0
Based on the gdb trace:
(gdb) info line *(blk_add_timer+0x74)
Line 138 of "block/blk-timeout.c" starts at address 0xc000000000d5637c <blk_add_timer+108>
and ends at 0xc000000000d5638c <blk_add_timer+124>.
This corresponds to:
128 void blk_add_timer(struct request *req)
129 {
130 	struct request_queue *q = req->q;
131 	unsigned long expiry;
132
133 	/*
134 	 * Some LLDs, like scsi, peek at the timeout to prevent a
135 	 * command from being retried forever.
136 	 */
137 	if (!req->timeout)
138 		req->timeout = q->rq_timeout;
As seen above, the read access to q->rq_timeout is unmarked. To avoid the reported data race, we should replace this plain access with READ_ONCE(q->rq_timeout). This is one instance in the I/O hotpath where q->rq_timeout is read without annotation, and there are likely a few more similar cases that should be updated in the same way.
That's an existing issue and an issue that falls outside the scope of my patch.
Thanks,
Bart.
On 11/7/25 2:19 AM, Bart Van Assche wrote:
On 11/6/25 5:01 AM, Nilay Shroff wrote:
@@ -154,10 +153,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
-	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+	data_race(disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10));
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
I think we don't need data_race() here as disk->bdi->ra_pages is already protected by ->limits_lock. Furthermore, while you're at it, I’d suggest protecting the set/get access of ->ra_pages using ->limits_lock when it’s invoked from the ioctl context (BLKRASET/BLKRAGET).
I think that we really need the data_race() annotation here because there is plenty of code that reads ra_pages without using any locking.
In that case, I believe we need to annotate both the reader and the writer with data_race(). Annotating only the writer but not the reader would not help suppress KCSAN reports of a data race.
@@ -480,9 +468,7 @@ static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 	if (err || val == 0)
 		return -EINVAL;
 
-	memflags = blk_mq_freeze_queue(q);
-	blk_queue_rq_timeout(q, msecs_to_jiffies(val));
-	blk_mq_unfreeze_queue(q, memflags);
+	data_race((blk_queue_rq_timeout(q, msecs_to_jiffies(val)), 0));
 
 	return count;
 }
The use of data_race() above seems redundant, since the update to q->rq_timeout is already marked with WRITE_ONCE(). However, the read access to q->rq_timeout in a few places within the I/O hotpath is not marked and instead accessed directly using plain C-language loads.
That's fair. I will remove data_race() again from queue_io_timeout_store().
BUG: KCSAN: data-race in blk_add_timer+0x74/0x1f0
Based on the gdb trace:
(gdb) info line *(blk_add_timer+0x74)
Line 138 of "block/blk-timeout.c" starts at address 0xc000000000d5637c <blk_add_timer+108>
and ends at 0xc000000000d5638c <blk_add_timer+124>.
This corresponds to:
128 void blk_add_timer(struct request *req)
129 {
130 	struct request_queue *q = req->q;
131 	unsigned long expiry;
132
133 	/*
134 	 * Some LLDs, like scsi, peek at the timeout to prevent a
135 	 * command from being retried forever.
136 	 */
137 	if (!req->timeout)
138 		req->timeout = q->rq_timeout;
As seen above, the read access to q->rq_timeout is unmarked. To avoid the reported data race, we should replace this plain access with READ_ONCE(q->rq_timeout). This is one instance in the I/O hotpath where q->rq_timeout is read without annotation, and there are likely a few more similar cases that should be updated in the same way.
That's an existing issue and an issue that falls outside the scope of my patch.
Hmm, I don’t think this issue falls outside the scope of the current patch. Without this change, it would not be possible for queue_io_timeout_store() to run concurrently with the I/O hotpath and update q->rq_timeout while it’s being read. Since this patch removes queue freeze from queue_io_timeout_store(), it can now potentially execute concurrently with the I/O hotpath, which could then manifest the KCSAN-reported data race described above.
Thanks,
--Nilay
On 11/7/25 2:41 AM, Nilay Shroff wrote:
On 11/7/25 2:19 AM, Bart Van Assche wrote:
On 11/6/25 5:01 AM, Nilay Shroff wrote:
@@ -154,10 +153,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
-	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+	data_race(disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10));
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
I think we don't need data_race() here as disk->bdi->ra_pages is already protected by ->limits_lock. Furthermore, while you're at it, I’d suggest protecting the set/get access of ->ra_pages using ->limits_lock when it’s invoked from the ioctl context (BLKRASET/BLKRAGET).
I think that we really need the data_race() annotation here because there is plenty of code that reads ra_pages without using any locking.
In that case, I believe we need to annotate both the reader and the writer with data_race(). Annotating only the writer but not the reader would not help suppress KCSAN reports of a data race.
No. As the comment above the data_race() macro explains, the data_race() macro makes memory accesses invisible to KCSAN. Hence, annotating writers only with data_race() is sufficient.
@@ -480,9 +468,7 @@ static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 	if (err || val == 0)
 		return -EINVAL;
 
-	memflags = blk_mq_freeze_queue(q);
-	blk_queue_rq_timeout(q, msecs_to_jiffies(val));
-	blk_mq_unfreeze_queue(q, memflags);
+	data_race((blk_queue_rq_timeout(q, msecs_to_jiffies(val)), 0));
 
 	return count;
 }
The use of data_race() above seems redundant, since the update to q->rq_timeout is already marked with WRITE_ONCE(). However, the read access to q->rq_timeout in a few places within the I/O hotpath is not marked and instead accessed directly using plain C-language loads.
That's an existing issue and an issue that falls outside the scope of my patch.
Hmm, I don’t think this issue falls outside the scope of the current patch. Without this change, it would not be possible for queue_io_timeout_store() to run concurrently with the I/O hotpath and update q->rq_timeout while it’s being read. Since this patch removes queue freeze from queue_io_timeout_store(), it can now potentially execute concurrently with the I/O hotpath, which could then manifest the KCSAN-reported data race described above.
Annotating all rq_timeout read accesses with READ_ONCE() would be cumbersome because there are plenty of direct rq_timeout accesses outside the block layer, e.g. in drivers/scsi/st.c (SCSI tape driver).
I prefer an alternative approach: annotate the q->rq_timeout update in blk_queue_rq_timeout() with both data_race() and WRITE_ONCE(). That guarantees that the rq_timeout update happens as a single store and prevents KCSAN from complaining about rq_timeout reads.
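I.e. something like this in block/blk-settings.c (untested sketch):

	void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
	{
		/*
		 * WRITE_ONCE() prevents store tearing; data_race() makes
		 * this write invisible to KCSAN.
		 */
		data_race(WRITE_ONCE(q->rq_timeout, timeout));
	}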
Thanks,
Bart.
On 11/7/25 9:48 PM, Bart Van Assche wrote:
On 11/7/25 2:41 AM, Nilay Shroff wrote:
On 11/7/25 2:19 AM, Bart Van Assche wrote:
On 11/6/25 5:01 AM, Nilay Shroff wrote:
@@ -154,10 +153,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
-	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
+	data_race(disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10));
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }
I think we don't need data_race() here as disk->bdi->ra_pages is already protected by ->limits_lock. Furthermore, while you're at it, I’d suggest protecting the set/get access of ->ra_pages using ->limits_lock when it’s invoked from the ioctl context (BLKRASET/BLKRAGET).
I think that we really need the data_race() annotation here because there is plenty of code that reads ra_pages without using any locking.
In that case, I believe we need to annotate both the reader and the writer with data_race(). Annotating only the writer but not the reader would not help suppress KCSAN reports of a data race.
No. As the comment above the data_race() macro explains, the data_race() macro makes memory accesses invisible to KCSAN. Hence, annotating writers only with data_race() is sufficient.
If the data access is marked using one of READ_ONCE(), WRITE_ONCE(), or data_race(), then KCSAN would not instrument that access. However, plain accesses are always instrumented.
So, if we only mark the writers but keep the readers as plain accesses, KCSAN would likely report a race at unknown origin — because the plain read access is still being watched, and when the variable’s value changes, it’s detected as a data race. This is exactly what I observed and reported earlier with this change.
@@ -480,9 +468,7 @@ static ssize_t queue_io_timeout_store(struct gendisk *disk, const char *page,
 	if (err || val == 0)
 		return -EINVAL;
 
-	memflags = blk_mq_freeze_queue(q);
-	blk_queue_rq_timeout(q, msecs_to_jiffies(val));
-	blk_mq_unfreeze_queue(q, memflags);
+	data_race((blk_queue_rq_timeout(q, msecs_to_jiffies(val)), 0));
 
 	return count;
 }
The use of data_race() above seems redundant, since the update to q->rq_timeout is already marked with WRITE_ONCE(). However, the read access to q->rq_timeout in a few places within the I/O hotpath is not marked and instead accessed directly using plain C-language loads.
That's an existing issue and an issue that falls outside the scope of my patch.
Hmm, I don’t think this issue falls outside the scope of the current patch. Without this change, it would not be possible for queue_io_timeout_store() to run concurrently with the I/O hotpath and update q->rq_timeout while it’s being read. Since this patch removes queue freeze from queue_io_timeout_store(), it can now potentially execute concurrently with the I/O hotpath, which could then manifest the KCSAN-reported data race described above.
Annotating all rq_timeout read accesses with READ_ONCE() would be cumbersome because there are plenty of direct rq_timeout accesses outside the block layer, e.g. in drivers/scsi/st.c (SCSI tape driver).
Yes, I agree there are plenty of them.
I prefer an alternative approach: annotate the q->rq_timeout update in blk_queue_rq_timeout() with both data_race() and WRITE_ONCE(). That guarantees that the rq_timeout update happens as a single store and prevents KCSAN from complaining about rq_timeout reads.
Again, if we avoid marking the read access then, as I mentioned above, it would cause a race at unknown origin. That said, you may spin a new patch with the change (as you see fit) and I'll help test and review it.

Thanks,
--Nilay
On 11/9/25 2:37 AM, Nilay Shroff wrote:
On 11/7/25 9:48 PM, Bart Van Assche wrote:
No. As the comment above the data_race() macro explains, the data_race() macro makes memory accesses invisible to KCSAN. Hence, annotating writers only with data_race() is sufficient.
If the data access is marked using one of READ_ONCE(), WRITE_ONCE(), or data_race(), then KCSAN would not instrument that access. However, plain accesses are always instrumented.
So, if we only mark the writers but keep the readers as plain accesses, KCSAN would likely report a race at unknown origin — because the plain read access is still being watched, and when the variable’s value changes, it’s detected as a data race. This is exactly what I observed and reported earlier with this change.
KCSAN watchpoints do not include the value of a memory location. Hence, I think that the above conclusion that KCSAN would report data races against writes marked with data_race() is wrong.
Bart.