From: Zijun Hu <quic_zijuhu(a)quicinc.com>
[ Upstream commit 1363c134ade81e425873b410566e957fecebb261 ]
fs_name() has @index as unsigned int, so there is underflow risk for
operation '@index--'.
Fix by breaking the for loop when '@index == 0' which is also more proper
than '@index <= 0' for unsigned integer comparison.
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
Link: https://lore.kernel.org/20250410-fix_fs-v1-1-7c14ccc8ebaa@quicinc.com
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
**YES** This commit should be backported to stable kernel trees.

**Detailed Analysis:**

**The Bug:** The `fs_name()` function at `fs/filesystems.c:156-174` has an
unsigned integer underflow. In the loop `for (tmp = file_systems; tmp;
tmp = tmp->next, index--)`, once `index` reaches 0 and
`try_module_get(tmp->owner)` fails, the next `index--` wraps the unsigned
value around to `UINT_MAX` (4294967295 wherever `unsigned int` is 32 bits,
which includes typical 64-bit systems). Because `index <= 0` is equivalent
to `index == 0` for an unsigned type, the condition
`if (index <= 0 && try_module_get(tmp->owner))` can never match again
after the wrap, so the loop needlessly walks the rest of the filesystem
list; the `<= 0` comparison also misleads readers about the variable's
signedness.

**The Fix:** The commit changes the logic:

- Old: `if (index <= 0 && try_module_get(tmp->owner))`
- New: `if (index == 0) { if (try_module_get(tmp->owner)) res = 0; break; }`

This prevents the unsigned integer from wrapping around and makes the
bounds check explicit.

**Impact and Severity:**

1. **User-accessible path**: `fs_name()` is called through the legacy
   `sysfs` system call (syscall 139 on x86-64) with option 2, making it
   directly reachable from userspace applications.
2. **Potential effects**: A call such as `sysfs(2, 0, buffer)` triggers
   the underflow whenever `try_module_get()` fails, leaving fragile,
   underflow-prone arithmetic on a user-reachable path.
3. **Core filesystem subsystem**: This affects the fundamental filesystem
   registration mechanism in the kernel.

**Comparison with Similar Commits:** This follows the same pattern as the
accepted backport examples:

- **Similar to Commit #1 (ntfs3)**: Both fix integer overflow/underflow
  issues that could cause instability
- **Similar to Commit #3 (f2fs)**: Both prevent integer arithmetic issues
  in filesystem code
- **Similar to Commit #5 (f2fs)**: Both add bounds checking

**Stable Tree Criteria:**

✅ **Fixes important bug**: Removes an unsigned-underflow hazard and a
misleading comparison
✅ **Small and contained**: Minimal code change, only affects one function
✅ **Clear side effects**: No architectural changes, just safer bounds
checking
✅ **Low regression risk**: The fix makes the function more robust without
changing expected behavior
✅ **Critical subsystem**: Filesystem management is fundamental to kernel
operation

**Conclusion:** This is a textbook example of a commit suitable for stable
backporting: it fixes a clear bug on a user-reachable path in core kernel
infrastructure, uses a minimal and safe approach, and carries negligible
regression risk.
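For illustration, here is a minimal userspace sketch, not part of the
patch, of how this path is reached; it assumes a kernel built with
CONFIG_SYSFS_SYSCALL, since the legacy sysfs(2) syscall is optional on
modern kernels:

```c
/*
 * Hypothetical demo: walk the registered filesystems via the legacy
 * sysfs(2) syscall, option 2, which copies the name of the fs_index-th
 * filesystem into buf. fs_index == 0 is the value that previously
 * kicked off the unsigned wrap-around in fs_name() when
 * try_module_get() failed.
 */
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	char buf[256];

	for (unsigned int i = 0; ; i++) {
		memset(buf, 0, sizeof(buf));
		/* sysfs(2, fs_index, buf): name of the i-th filesystem */
		if (syscall(SYS_sysfs, 2, i, buf) != 0)
			break;	/* fails with EINVAL past the end of the list */
		printf("%u: %s\n", i, buf);
	}
	return 0;
}
```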
fs/filesystems.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/fs/filesystems.c b/fs/filesystems.c
index 58b9067b2391c..95e5256821a53 100644
--- a/fs/filesystems.c
+++ b/fs/filesystems.c
@@ -156,15 +156,19 @@ static int fs_index(const char __user * __name)
static int fs_name(unsigned int index, char __user * buf)
{
struct file_system_type * tmp;
- int len, res;
+ int len, res = -EINVAL;
read_lock(&file_systems_lock);
- for (tmp = file_systems; tmp; tmp = tmp->next, index--)
- if (index <= 0 && try_module_get(tmp->owner))
+ for (tmp = file_systems; tmp; tmp = tmp->next, index--) {
+ if (index == 0) {
+ if (try_module_get(tmp->owner))
+ res = 0;
break;
+ }
+ }
read_unlock(&file_systems_lock);
- if (!tmp)
- return -EINVAL;
+ if (res)
+ return res;
/* OK, we got the reference, so we can safely block */
len = strlen(tmp->name) + 1;
--
2.39.5
From: Filipe Manana <fdmanana(a)suse.com>
[ Upstream commit 3bf179e36da917c5d9bec71c714573ed1649b7c1 ]
If insert_state() fails it returns an error pointer and we call
extent_io_tree_panic() which will trigger a BUG() call. However if
CONFIG_BUG is disabled, which is an uncommon and exotic scenario, then
we fall through and call cache_state() which will dereference the error
pointer, resulting in an invalid memory access.
So jump to the 'out' label after calling extent_io_tree_panic(); this
also makes the code clearer, besides dealing with the exotic scenario
where CONFIG_BUG is disabled.
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
**YES** This commit should be backported to stable kernel trees.

**Detailed Analysis:**

**1. Bug Description and Impact:** The commit fixes a serious potential
memory corruption bug in the btrfs filesystem. When `insert_state()` fails
in `btrfs_convert_extent_bit()`, it returns an error pointer (ERR_PTR).
The current code calls `extent_io_tree_panic()` which triggers `BUG()`,
but if `CONFIG_BUG` is disabled, the code continues executing and calls
`cache_state(inserted_state, cached_state)` with the error pointer,
causing an invalid memory access.

**2. Code Changes Analysis:** The fix is minimal and surgical - just a
single `goto out;` statement added after the `extent_io_tree_panic()`
call:

```c
if (IS_ERR(inserted_state)) {
	ret = PTR_ERR(inserted_state);
	extent_io_tree_panic(tree, prealloc, "insert", ret);
+	goto out; /* <-- the fix */
}
```

This ensures that when `CONFIG_BUG` is disabled, execution jumps to the
cleanup code instead of continuing with an invalid pointer.

**3. Comparison with Similar Commits:** This commit aligns with the
pattern seen in "Similar Commit #2" (Status: YES), which also:

- Removes reliance on `BUG_ON()` behavior
- Provides graceful error handling
- Has minimal risk
- Fixes a potential crash/corruption scenario

Like commits #3 and #5 (both Status: NO), it touches `BUG()` handling, but
unlike those commits, which make broader architectural changes to error
handling patterns, this fix is much more contained.

**4. Stable Tree Criteria Assessment:**

✅ **Fixes important bug**: Prevents potential memory corruption/crashes
✅ **Small and contained**: Single line addition
✅ **Minimal risk**: Only affects the error path when insert_state() fails
AND CONFIG_BUG is disabled
✅ **No new features**: Pure bug fix
✅ **No architectural changes**: Preserves existing error handling, just
prevents the fallthrough
✅ **Critical subsystem**: btrfs filesystem corruption prevention
✅ **Clear side effects**: No unintended consequences beyond fixing the bug

**5. Risk Assessment:**

- **Very low risk**: The change only affects an error condition that is
  already problematic
- **Exotic scenario**: Only impacts systems with `CONFIG_BUG` disabled
  (uncommon but not impossible)
- **No regression potential**: The change only prevents executing invalid
  code; it doesn't change normal operation
- **Well-contained**: Affects only one function in one file

**6. Security Implications:** While `CONFIG_BUG` disabled is uncommon,
this could potentially be exploited if an attacker can trigger the
`insert_state()` failure condition, leading to memory corruption. The fix
prevents this attack vector.

This is a clear candidate for stable backporting - it fixes a real bug
with minimal risk and follows the stable tree rules.
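To make the fallthrough hazard concrete, here is a small self-contained
userspace sketch using simplified stand-ins for the kernel's
ERR_PTR/IS_ERR/PTR_ERR helpers; insert_state_stub() and
convert_bit_stub() are hypothetical names, not the btrfs functions:

```c
/* Sketch of why the error path must jump to cleanup. */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ERRNO	4095

static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct state { int bits; };

static struct state *insert_state_stub(int fail)
{
	static struct state s;

	return fail ? ERR_PTR(-EEXIST) : &s;
}

static long convert_bit_stub(int fail)
{
	struct state *st;
	long ret = 0;

	st = insert_state_stub(fail);
	if (IS_ERR(st)) {
		ret = PTR_ERR(st);
		/* without this goto we would dereference an error pointer */
		goto out;
	}
	st->bits |= 1;	/* the cache_state() analogue */
out:
	return ret;
}

int main(void)
{
	printf("ok path: %ld, error path: %ld\n",
	       convert_bit_stub(0), convert_bit_stub(1));
	return 0;
}
```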
fs/btrfs/extent-io-tree.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
index 6d08c100b01de..bb3aaf610652a 100644
--- a/fs/btrfs/extent-io-tree.c
+++ b/fs/btrfs/extent-io-tree.c
@@ -1456,6 +1456,7 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
if (IS_ERR(inserted_state)) {
ret = PTR_ERR(inserted_state);
extent_io_tree_panic(tree, prealloc, "insert", ret);
+ goto out;
}
cache_state(inserted_state, cached_state);
if (inserted_state == prealloc)
--
2.39.5
From: Filipe Manana <fdmanana(a)suse.com>
[ Upstream commit 3bf179e36da917c5d9bec71c714573ed1649b7c1 ]
If insert_state() fails it returns an error pointer and we call
extent_io_tree_panic() which will trigger a BUG() call. However if
CONFIG_BUG is disabled, which is an uncommon and exotic scenario, then
we fall through and call cache_state() which will dereference the error
pointer, resulting in an invalid memory access.
So jump to the 'out' label after calling extent_io_tree_panic(); this
also makes the code clearer, besides dealing with the exotic scenario
where CONFIG_BUG is disabled.
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
**YES** This commit should be backported to stable kernel trees.

**Detailed Analysis:**

**1. Bug Description and Impact:** The commit fixes a serious potential
memory corruption bug in the btrfs filesystem. When `insert_state()` fails
in `btrfs_convert_extent_bit()`, it returns an error pointer (ERR_PTR).
The current code calls `extent_io_tree_panic()` which triggers `BUG()`,
but if `CONFIG_BUG` is disabled, the code continues executing and calls
`cache_state(inserted_state, cached_state)` with the error pointer,
causing an invalid memory access.

**2. Code Changes Analysis:** The fix is minimal and surgical - just a
single `goto out;` statement added after the `extent_io_tree_panic()`
call:

```c
if (IS_ERR(inserted_state)) {
	ret = PTR_ERR(inserted_state);
	extent_io_tree_panic(tree, prealloc, "insert", ret);
+	goto out; /* <-- the fix */
}
```

This ensures that when `CONFIG_BUG` is disabled, execution jumps to the
cleanup code instead of continuing with an invalid pointer.

**3. Comparison with Similar Commits:** This commit aligns with the
pattern seen in "Similar Commit #2" (Status: YES), which also:

- Removes reliance on `BUG_ON()` behavior
- Provides graceful error handling
- Has minimal risk
- Fixes a potential crash/corruption scenario

Like commits #3 and #5 (both Status: NO), it touches `BUG()` handling, but
unlike those commits, which make broader architectural changes to error
handling patterns, this fix is much more contained.

**4. Stable Tree Criteria Assessment:**

✅ **Fixes important bug**: Prevents potential memory corruption/crashes
✅ **Small and contained**: Single line addition
✅ **Minimal risk**: Only affects the error path when insert_state() fails
AND CONFIG_BUG is disabled
✅ **No new features**: Pure bug fix
✅ **No architectural changes**: Preserves existing error handling, just
prevents the fallthrough
✅ **Critical subsystem**: btrfs filesystem corruption prevention
✅ **Clear side effects**: No unintended consequences beyond fixing the bug

**5. Risk Assessment:**

- **Very low risk**: The change only affects an error condition that is
  already problematic
- **Exotic scenario**: Only impacts systems with `CONFIG_BUG` disabled
  (uncommon but not impossible)
- **No regression potential**: The change only prevents executing invalid
  code; it doesn't change normal operation
- **Well-contained**: Affects only one function in one file

**6. Security Implications:** While `CONFIG_BUG` disabled is uncommon,
this could potentially be exploited if an attacker can trigger the
`insert_state()` failure condition, leading to memory corruption. The fix
prevents this attack vector.

This is a clear candidate for stable backporting - it fixes a real bug
with minimal risk and follows the stable tree rules.
fs/btrfs/extent-io-tree.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
index 13de6af279e52..92cfde37b1d33 100644
--- a/fs/btrfs/extent-io-tree.c
+++ b/fs/btrfs/extent-io-tree.c
@@ -1456,6 +1456,7 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
if (IS_ERR(inserted_state)) {
ret = PTR_ERR(inserted_state);
extent_io_tree_panic(tree, prealloc, "insert", ret);
+ goto out;
}
cache_state(inserted_state, cached_state);
if (inserted_state == prealloc)
--
2.39.5
There is an ABBA deadlock scenario between hugetlb_fault() and
hugetlb_wp() on the pagecache folio's lock and the hugetlb global mutex,
which is reproducible with syzkaller [1]. As the stack traces below
reveal, process-1 tries to take the hugetlb global mutex (A3) while
holding the pagecache folio's lock. Process-2 holds the hugetlb global
mutex but tries to take the pagecache folio's lock.
Process-1                                Process-2
=========                                =========
hugetlb_fault
  mutex_lock (A1)
  filemap_lock_hugetlb_folio (B1)
  hugetlb_wp
    alloc_hugetlb_folio #error
    mutex_unlock (A2)
                                         hugetlb_fault
                                           mutex_lock (A4)
                                           filemap_lock_hugetlb_folio (B4)
    unmap_ref_private
      mutex_lock (A3)
Fix it by releasing the pagecache folio's lock at (A2) in process-1, so
that the lock is available to process-2 at (B4), avoiding the deadlock. In
process-1, a new variable is added to track whether the pagecache folio's
lock has already been released by its child function hugetlb_wp(), to
avoid a double release of the lock in hugetlb_fault(). Similar changes are
applied to hugetlb_no_page().
Link: https://drive.google.com/file/d/1DVRnIW-vSayU5J1re9Ct_br3jJQU6Vpb/view?usp=… [1]
Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Cc: <stable(a)vger.kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Florent Revest <revest(a)google.com>
Reviewed-by: Gavin Shan <gshan(a)redhat.com>
Signed-off-by: Gavin Guo <gavinguo(a)igalia.com>
---
V1 -> V2
Suggested by Oscar Salvador:
- Use folio_test_locked to replace the unnecessary parameter passing.
V2 -> V3
- Dropped the approach suggested by Oscar.
- Refined the code and commit message as suggested by Gavin Shan.
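For reference, a compact userspace sketch of the ownership-handoff
contract this patch introduces between hugetlb_fault() and hugetlb_wp();
the names are hypothetical and a pthread mutex stands in for the folio
lock:

```c
/* Callee may drop a lock the caller took; it reports that through a
 * bool out-parameter so the caller never unlocks twice. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t folio_lock = PTHREAD_MUTEX_INITIALIZER;

/* analogue of hugetlb_wp(): may release folio_lock to avoid ABBA */
static void callee(bool need_contended_lock, bool *folio_locked)
{
	if (need_contended_lock) {
		pthread_mutex_unlock(&folio_lock);
		if (folio_locked)
			*folio_locked = false;
		/* ... now safe to take the other lock here ... */
	}
}

/* analogue of hugetlb_fault() */
static void caller(bool trigger)
{
	bool folio_locked = true;

	pthread_mutex_lock(&folio_lock);
	callee(trigger, &folio_locked);
	if (folio_locked)	/* avoid a double unlock */
		pthread_mutex_unlock(&folio_lock);
}

int main(void)
{
	caller(false);
	caller(true);
	puts("no double unlock on either path");
	return 0;
}
```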
mm/hugetlb.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a3cf7935c14..560b9b35262a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6137,7 +6137,8 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
* Keep the pte_same checks anyway to make transition from the mutex easier.
*/
static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
- struct vm_fault *vmf)
+ struct vm_fault *vmf,
+ bool *pagecache_folio_locked)
{
struct vm_area_struct *vma = vmf->vma;
struct mm_struct *mm = vma->vm_mm;
@@ -6234,6 +6235,18 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio,
u32 hash;
folio_put(old_folio);
+ /*
+ * The pagecache_folio has to be unlocked to avoid
+ * deadlock and we won't re-lock it in hugetlb_wp(). The
+ * pagecache_folio could be truncated after being
+ * unlocked. So its state should not be reliable
+ * subsequently.
+ */
+ if (pagecache_folio) {
+ folio_unlock(pagecache_folio);
+ if (pagecache_folio_locked)
+ *pagecache_folio_locked = false;
+ }
/*
* Drop hugetlb_fault_mutex and vma_lock before
* unmapping. unmapping needs to hold vma_lock
@@ -6588,7 +6601,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
hugetlb_count_add(pages_per_huge_page(h), mm);
if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
/* Optimization, do the COW without a second fault */
- ret = hugetlb_wp(folio, vmf);
+ ret = hugetlb_wp(folio, vmf, NULL);
}
spin_unlock(vmf->ptl);
@@ -6660,6 +6673,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
struct hstate *h = hstate_vma(vma);
struct address_space *mapping;
int need_wait_lock = 0;
+ bool pagecache_folio_locked = true;
struct vm_fault vmf = {
.vma = vma,
.address = address & huge_page_mask(h),
@@ -6814,7 +6828,8 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
if (flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) {
if (!huge_pte_write(vmf.orig_pte)) {
- ret = hugetlb_wp(pagecache_folio, &vmf);
+ ret = hugetlb_wp(pagecache_folio, &vmf,
+ &pagecache_folio_locked);
goto out_put_page;
} else if (likely(flags & FAULT_FLAG_WRITE)) {
vmf.orig_pte = huge_pte_mkdirty(vmf.orig_pte);
@@ -6832,7 +6847,9 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
spin_unlock(vmf.ptl);
if (pagecache_folio) {
- folio_unlock(pagecache_folio);
+ if (pagecache_folio_locked)
+ folio_unlock(pagecache_folio);
+
folio_put(pagecache_folio);
}
out_mutex:
base-commit: 914873bc7df913db988284876c16257e6ab772c6
--
2.43.0
echo_skb_max should define the supported upper limit of echo_skb[]
allocated inside the netdevice's priv. The corresponding size value
provided by this driver to alloc_candev() is KVASER_PCIEFD_CAN_TX_MAX_COUNT
which is 17.
But later echo_skb_max is rounded up to the nearest power of two (for the
max case, that would be 32) and the tx/ack indices calculated further
during tx/rx may exceed the upper array boundary. KASAN reported this for
the ack case inside kvaser_pciefd_handle_ack_packet(), though the xmit
function had actually caught the same thing earlier.
BUG: KASAN: slab-out-of-bounds in kvaser_pciefd_handle_ack_packet+0x2d7/0x92a drivers/net/can/kvaser_pciefd.c:1528
Read of size 8 at addr ffff888105e4f078 by task swapper/4/0
CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Not tainted 6.15.0 #12 PREEMPT(voluntary)
Call Trace:
<IRQ>
dump_stack_lvl lib/dump_stack.c:122
print_report mm/kasan/report.c:521
kasan_report mm/kasan/report.c:634
kvaser_pciefd_handle_ack_packet drivers/net/can/kvaser_pciefd.c:1528
kvaser_pciefd_read_packet drivers/net/can/kvaser_pciefd.c:1605
kvaser_pciefd_read_buffer drivers/net/can/kvaser_pciefd.c:1656
kvaser_pciefd_receive_irq drivers/net/can/kvaser_pciefd.c:1684
kvaser_pciefd_irq_handler drivers/net/can/kvaser_pciefd.c:1733
__handle_irq_event_percpu kernel/irq/handle.c:158
handle_irq_event kernel/irq/handle.c:210
handle_edge_irq kernel/irq/chip.c:833
__common_interrupt arch/x86/kernel/irq.c:296
common_interrupt arch/x86/kernel/irq.c:286
</IRQ>
The tx max count definitely matters for kvaser_pciefd_tx_avail(), but for
seq number generation that's not the case - we're free to calculate the
seq numbers however is most convenient, without taking the tx max count
into account. The only downside is that the size of echo_skb[] should
correspond to the max seq number (not the tx max count), so in some
situations a bit more memory is consumed than strictly necessary.
Thus make the size of the underlying echo_skb[] sufficient for the rounded
max tx value.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 8256e0ca6010 ("can: kvaser_pciefd: Fix echo_skb race")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
v2: fix the problem by rounding up the KVASER_PCIEFD_CAN_TX_MAX_COUNT
constant when allocating candev (Axel Forsman)
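To see why the allocation has to use the rounded value, here is a
standalone sketch of the index masking; the constant mirrors the driver
and roundup_pow_of_two_u() is a local stand-in for the kernel helper:

```c
/* With TX_MAX_COUNT == 17, masking a free-running index with
 * roundup_pow_of_two(17) - 1 == 31 yields slots 0..31, so echo_skb[]
 * must hold 32 entries, not 17. */
#include <stdio.h>

#define TX_MAX_COUNT 17U

static unsigned int roundup_pow_of_two_u(unsigned int n)
{
	unsigned int r = 1;

	while (r < n)
		r <<= 1;
	return r;
}

int main(void)
{
	unsigned int size = roundup_pow_of_two_u(TX_MAX_COUNT);	/* 32 */
	unsigned int max_slot = 0;

	for (unsigned int idx = 0; idx < 1000; idx++) {
		unsigned int seq = idx & (size - 1);

		if (seq > max_slot)
			max_slot = seq;
	}
	printf("array size must be %u (max slot seen %u), not %u\n",
	       size, max_slot, TX_MAX_COUNT);
	return 0;
}
```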
drivers/net/can/kvaser_pciefd.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/can/kvaser_pciefd.c b/drivers/net/can/kvaser_pciefd.c
index f6921368cd14..0071a51ce2c1 100644
--- a/drivers/net/can/kvaser_pciefd.c
+++ b/drivers/net/can/kvaser_pciefd.c
@@ -966,7 +966,7 @@ static int kvaser_pciefd_setup_can_ctrls(struct kvaser_pciefd *pcie)
u32 status, tx_nr_packets_max;
netdev = alloc_candev(sizeof(struct kvaser_pciefd_can),
- KVASER_PCIEFD_CAN_TX_MAX_COUNT);
+ roundup_pow_of_two(KVASER_PCIEFD_CAN_TX_MAX_COUNT));
if (!netdev)
return -ENOMEM;
@@ -995,7 +995,6 @@ static int kvaser_pciefd_setup_can_ctrls(struct kvaser_pciefd *pcie)
can->tx_max_count = min(KVASER_PCIEFD_CAN_TX_MAX_COUNT, tx_nr_packets_max - 1);
can->can.clock.freq = pcie->freq;
- can->can.echo_skb_max = roundup_pow_of_two(can->tx_max_count);
spin_lock_init(&can->lock);
can->can.bittiming_const = &kvaser_pciefd_bittiming_const;
--
2.49.0
echo_skb_max should define the supported upper limit of echo_skb[]
allocated inside the netdevice's priv. The corresponding size value
provided by this driver to alloc_candev() is KVASER_PCIEFD_CAN_TX_MAX_COUNT
which is 17.
But later echo_skb_max is rounded up to the nearest power of two (for the
max case, that would be 32) and the tx/ack indices calculated further
during tx/rx may exceed the upper array boundary. KASAN reported this for
the ack case inside kvaser_pciefd_handle_ack_packet(), though the xmit
function had actually caught the same thing earlier.
BUG: KASAN: slab-out-of-bounds in kvaser_pciefd_handle_ack_packet+0x2d7/0x92a drivers/net/can/kvaser_pciefd.c:1528
Read of size 8 at addr ffff888105e4f078 by task swapper/4/0
CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Not tainted 6.15.0 #12 PREEMPT(voluntary)
Call Trace:
<IRQ>
dump_stack_lvl lib/dump_stack.c:122
print_report mm/kasan/report.c:521
kasan_report mm/kasan/report.c:634
kvaser_pciefd_handle_ack_packet drivers/net/can/kvaser_pciefd.c:1528
kvaser_pciefd_read_packet drivers/net/can/kvaser_pciefd.c:1605
kvaser_pciefd_read_buffer drivers/net/can/kvaser_pciefd.c:1656
kvaser_pciefd_receive_irq drivers/net/can/kvaser_pciefd.c:1684
kvaser_pciefd_irq_handler drivers/net/can/kvaser_pciefd.c:1733
__handle_irq_event_percpu kernel/irq/handle.c:158
handle_irq_event kernel/irq/handle.c:210
handle_edge_irq kernel/irq/chip.c:833
__common_interrupt arch/x86/kernel/irq.c:296
common_interrupt arch/x86/kernel/irq.c:286
</IRQ>
Remove the echo_skb_max rounding, as it may increase the value beyond what
the underlying array supports. This effectively restores the echo_skb_max
handling logic from before commit
8256e0ca6010 ("can: kvaser_pciefd: Fix echo_skb race").
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 8256e0ca6010 ("can: kvaser_pciefd: Fix echo_skb race")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
Actually the trick with rounding up allows seq numbers to be calculated
efficiently, avoiding the more expensive 'mod' operation used in the
current patch.
I see the tx max count definitely matters only for
kvaser_pciefd_tx_avail(), but for seq number generation that's not the
case - we're free to calculate the seq numbers however is most convenient,
without taking the tx max count into account. The only downside is that
the size of echo_skb[] should correspond to the max seq number (not the tx
max count), so in some situations a bit more memory is consumed than
strictly necessary.
So another approach to fix the problem would be to precompute the rounded
up value of echo_skb_max and pass it to alloc_candev() making the size of
the underlying echo_skb[] sufficient.
If that looks more acceptable, I'll be glad to rework the patch in that
direction.
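A standalone sketch of the mod-based wrap used in this version of the
patch; ECHO_SKB_MAX here is an assumed example value matching
KVASER_PCIEFD_CAN_TX_MAX_COUNT:

```c
/* Indexing with '%' against the true echo_skb_max keeps the slot
 * inside the allocated array for any count, including non-powers of
 * two, at the cost of a division instead of a mask. */
#include <assert.h>
#include <stdio.h>

#define ECHO_SKB_MAX 17U

int main(void)
{
	for (unsigned int tx_idx = 0; tx_idx < 1000; tx_idx++) {
		unsigned int seq = tx_idx % ECHO_SKB_MAX;

		assert(seq < ECHO_SKB_MAX);	/* always in bounds */
	}
	puts("all slots in bounds with mod-based wrap");
	return 0;
}
```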
drivers/net/can/kvaser_pciefd.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/net/can/kvaser_pciefd.c b/drivers/net/can/kvaser_pciefd.c
index f6921368cd14..1ec4ab9510b6 100644
--- a/drivers/net/can/kvaser_pciefd.c
+++ b/drivers/net/can/kvaser_pciefd.c
@@ -411,7 +411,6 @@ struct kvaser_pciefd_can {
void __iomem *reg_base;
struct can_berr_counter bec;
u8 cmd_seq;
- u8 tx_max_count;
u8 tx_idx;
u8 ack_idx;
int err_rep_cnt;
@@ -760,7 +759,7 @@ static int kvaser_pciefd_stop(struct net_device *netdev)
static unsigned int kvaser_pciefd_tx_avail(const struct kvaser_pciefd_can *can)
{
- return can->tx_max_count - (READ_ONCE(can->tx_idx) - READ_ONCE(can->ack_idx));
+ return can->can.echo_skb_max - (READ_ONCE(can->tx_idx) - READ_ONCE(can->ack_idx));
}
static int kvaser_pciefd_prepare_tx_packet(struct kvaser_pciefd_tx_packet *p,
@@ -810,7 +809,7 @@ static netdev_tx_t kvaser_pciefd_start_xmit(struct sk_buff *skb,
{
struct kvaser_pciefd_can *can = netdev_priv(netdev);
struct kvaser_pciefd_tx_packet packet;
- unsigned int seq = can->tx_idx & (can->can.echo_skb_max - 1);
+ unsigned int seq = can->tx_idx % can->can.echo_skb_max;
unsigned int frame_len;
int nr_words;
@@ -992,10 +991,9 @@ static int kvaser_pciefd_setup_can_ctrls(struct kvaser_pciefd *pcie)
tx_nr_packets_max =
FIELD_GET(KVASER_PCIEFD_KCAN_TX_NR_PACKETS_MAX_MASK,
ioread32(can->reg_base + KVASER_PCIEFD_KCAN_TX_NR_PACKETS_REG));
- can->tx_max_count = min(KVASER_PCIEFD_CAN_TX_MAX_COUNT, tx_nr_packets_max - 1);
+ can->can.echo_skb_max = min(KVASER_PCIEFD_CAN_TX_MAX_COUNT, tx_nr_packets_max - 1);
can->can.clock.freq = pcie->freq;
- can->can.echo_skb_max = roundup_pow_of_two(can->tx_max_count);
spin_lock_init(&can->lock);
can->can.bittiming_const = &kvaser_pciefd_bittiming_const;
@@ -1523,7 +1521,7 @@ static int kvaser_pciefd_handle_ack_packet(struct kvaser_pciefd *pcie,
unsigned int len, frame_len = 0;
struct sk_buff *skb;
- if (echo_idx != (can->ack_idx & (can->can.echo_skb_max - 1)))
+ if (echo_idx != can->ack_idx % can->can.echo_skb_max)
return 0;
skb = can->can.echo_skb[echo_idx];
if (!skb)
--
2.49.0