From: Sergey Bashirov sergeybashirov@gmail.com
[ Upstream commit 81438498a285759f31e843ac4800f82a5ce6521f ]
Because of integer division, we need to carefully calculate the disk offset. Consider the example below for a stripe of 6 volumes, a chunk size of 4096, and an offset of 70000.
chunk = div_u64(offset, dev->chunk_size) = 70000 / 4096 = 17 offset = chunk * dev->chunk_size = 17 * 4096 = 69632 disk_offset_wrong = div_u64(offset, dev->nr_children) = 69632 / 6 = 11605 disk_chunk = div_u64(chunk, dev->nr_children) = 17 / 6 = 2 disk_offset = disk_chunk * dev->chunk_size = 2 * 4096 = 8192
Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/20250701122341.199112-1-sergeybashirov@gmail.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
## Bug Fix Nature The commit fixes a critical mathematical calculation error in the pNFS block/SCSI layout stripe mapping function. The bug causes incorrect disk offset calculations due to improper integer division, which could lead to data corruption or incorrect data access in striped pNFS block layouts.
## Specific Code Analysis
The bug is in the `bl_map_stripe()` function at line 276 of the original code: ```c disk_offset = div_u64(offset, dev->nr_children); ```
This calculation is fundamentally wrong. The commit message provides a clear example showing how the bug manifests: - With 6 volumes, chunk size 4096, and offset 70000 - The wrong calculation yields disk_offset = 11605 - The correct calculation yields disk_offset = 8192
The fix changes the calculation to: ```c disk_chunk = div_u64_rem(chunk, dev->nr_children, &chunk_idx); disk_offset = disk_chunk * dev->chunk_size; ```
This properly calculates which chunk on the specific disk should be accessed by: 1. First determining the disk_chunk number (which chunk on the target disk) 2. Then multiplying by chunk_size to get the actual byte offset
## Stable Backport Criteria Met
1. **Fixes a real bug**: Yes - incorrect stripe offset calculation leading to wrong data access 2. **Small and contained**: Yes - only 3 lines changed, localized to one function 3. **No major architectural changes**: Correct - simple math fix 4. **Low regression risk**: Yes - the new calculation is mathematically correct and doesn't change any interfaces 5. **Important for users**: Yes - data corruption/access issues in pNFS deployments are serious
## Historical Context Looking at the git history, this area has had previous stable-worthy fixes: - Commit 0914bb965e38 fixed an off-by-one error and was explicitly marked for stable (Cc: stable@vger.kernel.org # 3.17+) - Commit 5466112f0935 fixed 64-bit division issues in the same function
This indicates that `bl_map_stripe()` is a critical function that has needed careful attention for correctness, and fixes to it have historically been considered stable-worthy.
## Impact Assessment Without this fix, any system using pNFS block layout with striping could experience: - Data written to wrong disk locations - Data read from wrong disk locations - Potential data corruption or loss
The fix is essential for correct operation of pNFS block layouts with striping configurations.
fs/nfs/blocklayout/dev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c index cab8809f0e0f..44306ac22353 100644 --- a/fs/nfs/blocklayout/dev.c +++ b/fs/nfs/blocklayout/dev.c @@ -257,10 +257,11 @@ static bool bl_map_stripe(struct pnfs_block_dev *dev, u64 offset, struct pnfs_block_dev *child; u64 chunk; u32 chunk_idx; + u64 disk_chunk; u64 disk_offset;
chunk = div_u64(offset, dev->chunk_size); - div_u64_rem(chunk, dev->nr_children, &chunk_idx); + disk_chunk = div_u64_rem(chunk, dev->nr_children, &chunk_idx);
if (chunk_idx >= dev->nr_children) { dprintk("%s: invalid chunk idx %d (%lld/%lld)\n", @@ -273,7 +274,7 @@ static bool bl_map_stripe(struct pnfs_block_dev *dev, u64 offset, offset = chunk * dev->chunk_size;
/* disk offset of the stripe */ - disk_offset = div_u64(offset, dev->nr_children); + disk_offset = disk_chunk * dev->chunk_size;
child = &dev->children[chunk_idx]; child->map(child, disk_offset, map);