On Wed, 2018-04-04 at 17:54 +0900, Damien Le Moal wrote:
Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is called from sd_probe_async() and since sd_revalidate_disk() calls sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called concurrently with operations referencing a drive zone bitmaps and number
^^^^^^^^^^^^^^^^^^^^
Should "a" be changed into "the"?
[Damien] Updated commit message and changed nr_zones/bitmap swap order.
Updating the number of zones after having updated the bitmap pointers is not sufficient to avoid trouble if the number of zones reported by the drive changes while I/O is in progress. With the current implementation, if the number of zones changes, the seq_zones_bitmap is cleared. Can this cause trouble for the mq-deadline scheduler? Additionally, CPUs other than x86 can reorder store operations, so the order in which the stores appear in the code is not necessarily the order in which another CPU observes them. Even worse, a CPU could cache the zone bitmap pointers, which means that at least RCU protection + kfree_rcu() is needed to avoid trouble. I think we should either handle this case properly or issue a kernel warning.
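
To illustrate the kind of scheme I have in mind, here is a rough userspace sketch, not a patch. It uses C11 atomics as a stand-in for rcu_assign_pointer() and rcu_dereference(). The struct zone_info container and all helper names are made up for this example and do not exist in sd_zbc.c. The idea is to keep the zone count and the bitmap in one object and to publish that object through a single pointer, so a reader can never pair an old bitmap with a new zone count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITS_PER_WORD (8 * sizeof(unsigned long))

/* Hypothetical container: zone count and bitmap travel together. */
struct zone_info {
	unsigned int nr_zones;
	unsigned long seq_zones_bitmap[];	/* one bit per zone */
};

/* Stand-in for an __rcu pointer; starts out NULL. */
static _Atomic(struct zone_info *) cur_zones;

/* Allocate a zeroed snapshot for nr_zones zones (NULL on failure). */
static struct zone_info *zone_info_alloc(unsigned int nr_zones)
{
	size_t words = (nr_zones + BITS_PER_WORD - 1) / BITS_PER_WORD;
	struct zone_info *zi;

	zi = calloc(1, sizeof(*zi) + words * sizeof(unsigned long));
	if (zi)
		zi->nr_zones = nr_zones;
	return zi;
}

/* Mark a zone as sequential; only valid before the snapshot is published. */
static void zone_info_set_seq(struct zone_info *zi, unsigned int zno)
{
	assert(zno < zi->nr_zones);
	zi->seq_zones_bitmap[zno / BITS_PER_WORD] |= 1UL << (zno % BITS_PER_WORD);
}

/*
 * Writer: publish a fully initialized snapshot with release semantics
 * (roughly what rcu_assign_pointer() provides). Returns the previous
 * snapshot. In the kernel the old object would go to kfree_rcu() so
 * that readers still using it can finish first.
 */
static struct zone_info *zone_info_publish(struct zone_info *new_zi)
{
	return atomic_exchange_explicit(&cur_zones, new_zi,
					memory_order_acq_rel);
}

/*
 * Reader: load the pointer once with acquire semantics (roughly
 * rcu_dereference()) and use only that snapshot afterwards, so
 * nr_zones and the bitmap always belong to the same object.
 */
static bool zone_is_seq(unsigned int zno)
{
	struct zone_info *zi;

	zi = atomic_load_explicit(&cur_zones, memory_order_acquire);
	if (!zi || zno >= zi->nr_zones)
		return false;
	return zi->seq_zones_bitmap[zno / BITS_PER_WORD] &
	       (1UL << (zno % BITS_PER_WORD));
}
```

A revalidation path would build a new snapshot with zone_info_alloc() and zone_info_set_seq(), call zone_info_publish(), and defer freeing the returned pointer until no reader can still hold it. The sketch leaves out that grace period entirely, which is exactly the part RCU provides in the kernel.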
Thanks,
Bart.