From: Andrew Kanner <andrew.kanner@gmail.com>
[ Upstream commit cade5397e5461295f3cb87880534b6a07cafa427 ]
Syzkaller reported the following issue:

==================================================================
BUG: KASAN: double-free in slab_free mm/slub.c:3787 [inline]
BUG: KASAN: double-free in __kmem_cache_free+0x71/0x110 mm/slub.c:3800
Free of addr ffff888086408000 by task syz-executor.4/12750
[...]
Call Trace:
 <TASK>
 [...]
 kasan_report_invalid_free+0xac/0xd0 mm/kasan/report.c:482
 ____kasan_slab_free+0xfb/0x120
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1781 [inline]
 slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1807
 slab_free mm/slub.c:3787 [inline]
 __kmem_cache_free+0x71/0x110 mm/slub.c:3800
 dbUnmount+0xf4/0x110 fs/jfs/jfs_dmap.c:264
 jfs_umount+0x248/0x3b0 fs/jfs/jfs_umount.c:87
 jfs_put_super+0x86/0x190 fs/jfs/super.c:194
 generic_shutdown_super+0x130/0x310 fs/super.c:492
 kill_block_super+0x79/0xd0 fs/super.c:1386
 deactivate_locked_super+0xa7/0xf0 fs/super.c:332
 cleanup_mnt+0x494/0x520 fs/namespace.c:1291
 task_work_run+0x243/0x300 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171
 exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296
 do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [...]
 </TASK>
Allocated by task 13352:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
 ____kasan_kmalloc mm/kasan/common.c:371 [inline]
 __kasan_kmalloc+0x97/0xb0 mm/kasan/common.c:380
 kmalloc include/linux/slab.h:580 [inline]
 dbMount+0x54/0x980 fs/jfs/jfs_dmap.c:164
 jfs_mount+0x1dd/0x830 fs/jfs/jfs_mount.c:121
 jfs_fill_super+0x590/0xc50 fs/jfs/super.c:556
 mount_bdev+0x26c/0x3a0 fs/super.c:1359
 legacy_get_tree+0xea/0x180 fs/fs_context.c:610
 vfs_get_tree+0x88/0x270 fs/super.c:1489
 do_new_mount+0x289/0xad0 fs/namespace.c:3145
 do_mount fs/namespace.c:3488 [inline]
 __do_sys_mount fs/namespace.c:3697 [inline]
 __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 13352:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x3d/0x60 mm/kasan/common.c:52
 kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:518
 ____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1781 [inline]
 slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1807
 slab_free mm/slub.c:3787 [inline]
 __kmem_cache_free+0x71/0x110 mm/slub.c:3800
 dbUnmount+0xf4/0x110 fs/jfs/jfs_dmap.c:264
 jfs_mount_rw+0x545/0x740 fs/jfs/jfs_mount.c:247
 jfs_remount+0x3db/0x710 fs/jfs/super.c:454
 reconfigure_super+0x3bc/0x7b0 fs/super.c:935
 vfs_fsconfig_locked fs/fsopen.c:254 [inline]
 __do_sys_fsconfig fs/fsopen.c:439 [inline]
 __se_sys_fsconfig+0xad5/0x1060 fs/fsopen.c:314
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [...]
JFS_SBI(ipbmap->i_sb)->bmap wasn't set to NULL after kfree() in dbUnmount().
Syzkaller uses fault injection to reproduce this KASAN double-free warning. The issue is triggered if either diMount() or dbMount() fails in jfs_remount(): diUnmount()/dbUnmount() have already run by that point, so the stale pointer is freed a second time on the next jfs_umount() or jfs_remount().
Tested on both upstream and jfs-next by syzkaller.
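To illustrate why clearing the cached pointer is enough, here is a minimal user-space sketch of the same free-then-clear pattern (plain C with made-up names, not the JFS code; free(NULL), like kfree(NULL), is defined to be a no-op):

  #include <stdlib.h>

  struct sb_info {
          char *bmap;                     /* stands in for JFS_SBI(sb)->bmap */
  };

  /* Analogue of dbUnmount(): release the in-memory map and clear the pointer. */
  static void unmount_map(struct sb_info *sbi)
  {
          free(sbi->bmap);
          sbi->bmap = NULL;               /* a later free of this pointer is a no-op */
  }

  int main(void)
  {
          struct sb_info sbi = { .bmap = malloc(64) };

          unmount_map(&sbi);              /* e.g. the failed-remount path */
          unmount_map(&sbi);              /* e.g. the final unmount: now harmless */
          return 0;
  }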
Reported-and-tested-by: syzbot+6a93efb725385bc4b2e9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000471f2d05f1ce8bad@google.com/T/
Link: https://syzkaller.appspot.com/bug?extid=6a93efb725385bc4b2e9
Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/jfs/jfs_dmap.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c
index bd4ef43b02033..e9d075cbd71ad 100644
--- a/fs/jfs/jfs_dmap.c
+++ b/fs/jfs/jfs_dmap.c
@@ -269,6 +269,7 @@ int dbUnmount(struct inode *ipbmap, int mounterror)
 
 	/* free the memory for the in-memory bmap. */
 	kfree(bmp);
+	JFS_SBI(ipbmap->i_sb)->bmap = NULL;
 
 	return (0);
 }
From: Liu Shixin via Jfs-discussion <jfs-discussion@lists.sourceforge.net>
[ Upstream commit 6e2bda2c192d0244b5a78b787ef20aa10cb319b7 ]
syzbot found an invalid-free in diUnmount:
BUG: KASAN: double-free in slab_free mm/slub.c:3661 [inline]
BUG: KASAN: double-free in __kmem_cache_free+0x71/0x110 mm/slub.c:3674
Free of addr ffff88806f410000 by task syz-executor131/3632
CPU: 0 PID: 3632 Comm: syz-executor131 Not tainted 6.1.0-rc7-syzkaller-00012-gca57f02295f1 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
 print_address_description+0x74/0x340 mm/kasan/report.c:284
 print_report+0x107/0x1f0 mm/kasan/report.c:395
 kasan_report_invalid_free+0xac/0xd0 mm/kasan/report.c:460
 ____kasan_slab_free+0xfb/0x120
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 __kmem_cache_free+0x71/0x110 mm/slub.c:3674
 diUnmount+0xef/0x100 fs/jfs/jfs_imap.c:195
 jfs_umount+0x108/0x370 fs/jfs/jfs_umount.c:63
 jfs_put_super+0x86/0x190 fs/jfs/super.c:194
 generic_shutdown_super+0x130/0x310 fs/super.c:492
 kill_block_super+0x79/0xd0 fs/super.c:1428
 deactivate_locked_super+0xa7/0xf0 fs/super.c:332
 cleanup_mnt+0x494/0x520 fs/namespace.c:1186
 task_work_run+0x243/0x300 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0x664/0x2070 kernel/exit.c:820
 do_group_exit+0x1fd/0x2b0 kernel/exit.c:950
 __do_sys_exit_group kernel/exit.c:961 [inline]
 __se_sys_exit_group kernel/exit.c:959 [inline]
 __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:959
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [...]
JFS_IP(ipimap)->i_imap is not set to NULL after being freed in diUnmount(). If jfs_remount() frees JFS_IP(ipimap)->i_imap but then fails in diMount(), JFS_IP(ipimap)->i_imap will be freed once again. Fix this problem by setting JFS_IP(ipimap)->i_imap to NULL after the free.
Reported-by: syzbot+90a11e6b1e810785c6ff@syzkaller.appspotmail.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/jfs/jfs_imap.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c
index 390cbfce391fc..6fb28572cb2c6 100644
--- a/fs/jfs/jfs_imap.c
+++ b/fs/jfs/jfs_imap.c
@@ -193,6 +193,7 @@ int diUnmount(struct inode *ipimap, int mounterror)
 	 * free in-memory control structure
 	 */
 	kfree(imap);
+	JFS_IP(ipimap)->i_imap = NULL;
 
 	return (0);
 }
From: Niklas Schnelle <schnelle@linux.ibm.com>
[ Upstream commit f768c75d61582b011962f9dcb9ff8eafb8da0383 ]
In the future inw() and friends will not be compiled on architectures without I/O port support.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Link: https://lore.kernel.org/r/20230703135255.2202721-2-schnelle@linux.ibm.com
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pci/quirks.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b7c65193e786a..276464fd3b30b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -268,6 +268,7 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_CBUS_2, quirk_isa_d
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NEC, PCI_DEVICE_ID_NEC_CBUS_3, quirk_isa_dma_hangs);
 #endif
 
+#ifdef CONFIG_HAS_IOPORT
 /*
  * Intel NM10 "TigerPoint" LPC PM1a_STS.BM_STS must be clear
  * for some HT machines to use C4 w/o hanging.
@@ -287,6 +288,7 @@ static void quirk_tigerpoint_bm_sts(struct pci_dev *dev)
 	}
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_TGP_LPC, quirk_tigerpoint_bm_sts);
+#endif
 
 /* Chipsets where PCI->PCI transfers vanish or hang */
 static void quirk_nopcipci(struct pci_dev *dev)
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit 43bbddc067883d94de7a43d5756a295439fbe37d ]
When we use lstart + len to calculate the end of free extent or prealloc space, it may exceed the maximum value of 4294967295(0xffffffff) supported by ext4_lblk_t and cause overflow, which may lead to various problems.
Therefore, we add two helper functions, extent_logical_end() and pa_logical_end(), to limit the type of end to loff_t, and also convert lstart to loff_t for calculation to avoid overflow.
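As a stand-alone illustration of the overflow these helpers avoid (plain C, not ext4 code; the EXT4_C2B() cluster conversion is omitted), adding a length to a logical start near the 32-bit limit wraps in ext4_lblk_t arithmetic, while promoting the start to a 64-bit type first, as extent_logical_end() and pa_logical_end() do via loff_t, keeps the real end:

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint32_t lstart = 4294967040u;          /* close to the ext4_lblk_t maximum */
          uint32_t len    = 512;

          uint32_t end32 = lstart + len;          /* 32-bit sum wraps around          */
          int64_t  end64 = (int64_t)lstart + len; /* what the loff_t promotion keeps  */

          printf("end as u32: %u\n", end32);              /* 256 (wrapped)        */
          printf("end as s64: %lld\n", (long long)end64); /* 4294967552 (correct) */
          return 0;
  }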
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20230724121059.11834-2-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ext4/mballoc.c |  9 +++------
 fs/ext4/mballoc.h | 14 ++++++++++++++
 2 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 3fa5de892d89d..f17c1573c2a65 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4246,7 +4246,7 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
 
 	/* first, let's learn actual file size
 	 * given current request is allocated */
-	size = ac->ac_o_ex.fe_logical + EXT4_C2B(sbi, ac->ac_o_ex.fe_len);
+	size = extent_logical_end(sbi, &ac->ac_o_ex);
 	size = size << bsbits;
 	if (size < i_size_read(ac->ac_inode))
 		size = i_size_read(ac->ac_inode);
@@ -4570,7 +4570,6 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
 	struct ext4_locality_group *lg;
 	struct ext4_prealloc_space *tmp_pa = NULL, *cpa = NULL;
-	loff_t tmp_pa_end;
 	struct rb_node *iter;
 	ext4_fsblk_t goal_block;
 
@@ -4666,9 +4665,7 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
 	 * pa can possibly satisfy the request hence check if it overlaps
 	 * original logical start and stop searching if it doesn't.
 	 */
-	tmp_pa_end = (loff_t)tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
-
-	if (ac->ac_o_ex.fe_logical >= tmp_pa_end) {
+	if (ac->ac_o_ex.fe_logical >= pa_logical_end(sbi, tmp_pa)) {
 		spin_unlock(&tmp_pa->pa_lock);
 		goto try_group_pa;
 	}
@@ -5573,7 +5570,7 @@ static void ext4_mb_group_or_file(struct ext4_allocation_context *ac)
 
 	group_pa_eligible = sbi->s_mb_group_prealloc > 0;
 	inode_pa_eligible = true;
-	size = ac->ac_o_ex.fe_logical + EXT4_C2B(sbi, ac->ac_o_ex.fe_len);
+	size = extent_logical_end(sbi, &ac->ac_o_ex);
 	isize = (i_size_read(ac->ac_inode) + ac->ac_sb->s_blocksize - 1)
 		>> bsbits;
 
diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h
index 6d85ee8674a64..f2b1d4fd98693 100644
--- a/fs/ext4/mballoc.h
+++ b/fs/ext4/mballoc.h
@@ -219,6 +219,20 @@ static inline ext4_fsblk_t ext4_grp_offs_to_block(struct super_block *sb,
 		(fex->fe_start << EXT4_SB(sb)->s_cluster_bits);
 }
 
+static inline loff_t extent_logical_end(struct ext4_sb_info *sbi,
+					struct ext4_free_extent *fex)
+{
+	/* Use loff_t to avoid end exceeding ext4_lblk_t max. */
+	return (loff_t)fex->fe_logical + EXT4_C2B(sbi, fex->fe_len);
+}
+
+static inline loff_t pa_logical_end(struct ext4_sb_info *sbi,
+				    struct ext4_prealloc_space *pa)
+{
+	/* Use loff_t to avoid end exceeding ext4_lblk_t max. */
+	return (loff_t)pa->pa_lstart + EXT4_C2B(sbi, pa->pa_len);
+}
+
 typedef int (*ext4_mballoc_query_range_fn)(
 	struct super_block		*sb,
 	ext4_group_t			agno,
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit bedc5d34632c21b5adb8ca7143d4c1f794507e4c ]
Let's say we want to allocate 2 blocks starting from 4294966386. After predicting the file size, start is aligned to 4294965248 and len is changed to 2048, so end = start + size = 0x100000000. Since end is of type ext4_lblk_t, i.e. an unsigned int, it is truncated to 0.
This causes (pa->pa_lstart >= end) to always hold when checking if the current extent to be allocated crosses already preallocated blocks, so the resulting ac_g_ex may cross already preallocated blocks. Hence we convert the end type to loff_t and use pa_logical_end() to avoid overflow.
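As a stand-alone arithmetic check of that claim (plain C with a made-up preallocation start, not the mballoc code):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint32_t start = 4294965248u;   /* aligned start from the example above */
          uint32_t size  = 2048;
          uint32_t end   = start + size;  /* 0x100000000 truncated to 0           */
          uint32_t pa_lstart = 12345;     /* hypothetical preallocation start     */

          /* With end truncated to 0 this test is always true, so the goal extent
           * is never adjusted away from the preallocated range. */
          printf("end = %u, pa_lstart >= end: %d\n", end, pa_lstart >= end);
          return 0;
  }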
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20230724121059.11834-4-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ext4/mballoc.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f17c1573c2a65..6e88f58ec6bb8 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4036,12 +4036,13 @@ ext4_mb_pa_rb_next_iter(ext4_lblk_t new_start, ext4_lblk_t cur_start, struct rb_
 
 static inline void
 ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac,
-			  ext4_lblk_t start, ext4_lblk_t end)
+			  ext4_lblk_t start, loff_t end)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
 	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
 	struct ext4_prealloc_space *tmp_pa;
-	ext4_lblk_t tmp_pa_start, tmp_pa_end;
+	ext4_lblk_t tmp_pa_start;
+	loff_t tmp_pa_end;
 	struct rb_node *iter;
 
 	read_lock(&ei->i_prealloc_lock);
@@ -4050,7 +4051,7 @@ ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac,
 		tmp_pa = rb_entry(iter, struct ext4_prealloc_space,
 				  pa_node.inode_node);
 		tmp_pa_start = tmp_pa->pa_lstart;
-		tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
+		tmp_pa_end = pa_logical_end(sbi, tmp_pa);
 
 		spin_lock(&tmp_pa->pa_lock);
 		if (tmp_pa->pa_deleted == 0)
@@ -4072,14 +4073,14 @@ ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac,
  */
 static inline void
 ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac,
-			  ext4_lblk_t *start, ext4_lblk_t *end)
+			  ext4_lblk_t *start, loff_t *end)
 {
 	struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
 	struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
 	struct ext4_prealloc_space *tmp_pa = NULL, *left_pa = NULL, *right_pa = NULL;
 	struct rb_node *iter;
-	ext4_lblk_t new_start, new_end;
-	ext4_lblk_t tmp_pa_start, tmp_pa_end, left_pa_end = -1, right_pa_start = -1;
+	ext4_lblk_t new_start, tmp_pa_start, right_pa_start = -1;
+	loff_t new_end, tmp_pa_end, left_pa_end = -1;
 
 	new_start = *start;
 	new_end = *end;
@@ -4098,7 +4099,7 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac,
 		tmp_pa = rb_entry(iter, struct ext4_prealloc_space,
 				  pa_node.inode_node);
 		tmp_pa_start = tmp_pa->pa_lstart;
-		tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
+		tmp_pa_end = pa_logical_end(sbi, tmp_pa);
 
 		/* PA must not overlap original request */
 		spin_lock(&tmp_pa->pa_lock);
@@ -4178,8 +4179,7 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac,
 	}
 
 	if (left_pa) {
-		left_pa_end =
-			left_pa->pa_lstart + EXT4_C2B(sbi, left_pa->pa_len);
+		left_pa_end = pa_logical_end(sbi, left_pa);
 		BUG_ON(left_pa_end > ac->ac_o_ex.fe_logical);
 	}
 
@@ -4218,8 +4218,7 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
 	struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
 	struct ext4_super_block *es = sbi->s_es;
 	int bsbits, max;
-	ext4_lblk_t end;
-	loff_t size, start_off;
+	loff_t size, start_off, end;
 	loff_t orig_size __maybe_unused;
 	ext4_lblk_t start;
From: Mark Brown <broonie@kernel.org>
[ Upstream commit fc8b24c28bec19fc0621d108b9ee81ddfdedb25a ]
The i.MX integration for the DesignWare PCI controller has a _host_exit() operation which undoes everything that the _host_init() operation does but does not wire this up as the host_deinit callback for the core, or call it in any path other than suspend. This means that if we ever unwind the initial probe of the device, for example because it fails, the regulator core complains that the regulators for the device were left enabled:
  imx6q-pcie 33800000.pcie: iATU: unroll T, 4 ob, 4 ib, align 64K, limit 16G
  imx6q-pcie 33800000.pcie: Phy link never came up
  imx6q-pcie 33800000.pcie: Phy link never came up
  imx6q-pcie: probe of 33800000.pcie failed with error -110
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 46 at drivers/regulator/core.c:2396 _regulator_put+0x110/0x128
Wire up the callback so that the core can clean up after itself.
Link: https://lore.kernel.org/r/20230731-pci-imx-regulator-cleanup-v2-1-fc8fa5c989...
Tested-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pci/controller/dwc/pci-imx6.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index 52906f999f2bb..e4942bd2598d5 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -1039,6 +1039,7 @@ static void imx6_pcie_host_exit(struct dw_pcie_rp *pp)
 
 static const struct dw_pcie_host_ops imx6_pcie_host_ops = {
 	.host_init = imx6_pcie_host_init,
+	.host_deinit = imx6_pcie_host_exit,
 };
 
 static const struct dw_pcie_ops dw_pcie_ops = {
From: Tomislav Novak <tnovak@fb.com>
[ Upstream commit e6b51532d5273eeefba84106daea3d392c602837 ]
Arm platforms use is_default_overflow_handler() to determine if the hw_breakpoint code should single-step over the breakpoint trigger or let the custom handler deal with it.
Since bpf_overflow_handler() currently isn't recognized as a default handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes it to keep firing (the instruction triggering the data abort exception is never skipped). For example:
  # bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
  Attaching 1 probe...
  hit
  hit
  [...]
  ^C
(./test performs a single 4-byte store to 0x10000)
This patch replaces the check with uses_default_overflow_handler(), which accounts for the bpf_overflow_handler() case by also testing whether one of the perf_event_output functions gets invoked indirectly, via orig_overflow_handler.
Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb....
Signed-off-by: Tomislav Novak <tnovak@fb.com>
Tested-by: Samuel Gosselin <sgosselin@google.com> # arm64
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/arm/kernel/hw_breakpoint.c   |  8 ++++----
 arch/arm64/kernel/hw_breakpoint.c |  4 ++--
 include/linux/perf_event.h        | 22 +++++++++++++++++---
 3 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 054e9199f30db..dc0fb7a813715 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -626,7 +626,7 @@ int hw_breakpoint_arch_parse(struct perf_event *bp,
 	hw->address &= ~alignment_mask;
 	hw->ctrl.len <<= offset;
 
-	if (is_default_overflow_handler(bp)) {
+	if (uses_default_overflow_handler(bp)) {
 		/*
 		 * Mismatch breakpoints are required for single-stepping
 		 * breakpoints.
@@ -798,7 +798,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr,
 		 * Otherwise, insert a temporary mismatch breakpoint so that
 		 * we can single-step over the watchpoint trigger.
 		 */
-		if (!is_default_overflow_handler(wp))
+		if (!uses_default_overflow_handler(wp))
 			continue;
 step:
 		enable_single_step(wp, instruction_pointer(regs));
@@ -811,7 +811,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr,
 		info->trigger = addr;
 		pr_debug("watchpoint fired: address = 0x%x\n", info->trigger);
 		perf_bp_event(wp, regs);
-		if (is_default_overflow_handler(wp))
+		if (uses_default_overflow_handler(wp))
 			enable_single_step(wp, instruction_pointer(regs));
 	}
 
@@ -886,7 +886,7 @@ static void breakpoint_handler(unsigned long unknown, struct pt_regs *regs)
 			info->trigger = addr;
 			pr_debug("breakpoint fired: address = 0x%x\n", addr);
 			perf_bp_event(bp, regs);
-			if (is_default_overflow_handler(bp))
+			if (uses_default_overflow_handler(bp))
 				enable_single_step(bp, addr);
 			goto unlock;
 		}
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index b29a311bb0552..9659a9555c63a 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -654,7 +654,7 @@ static int breakpoint_handler(unsigned long unused, unsigned long esr,
 		perf_bp_event(bp, regs);
 
 		/* Do we need to handle the stepping? */
-		if (is_default_overflow_handler(bp))
+		if (uses_default_overflow_handler(bp))
 			step = 1;
 unlock:
 		rcu_read_unlock();
@@ -733,7 +733,7 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
 static int watchpoint_report(struct perf_event *wp, unsigned long addr,
 			     struct pt_regs *regs)
 {
-	int step = is_default_overflow_handler(wp);
+	int step = uses_default_overflow_handler(wp);
 	struct arch_hw_breakpoint *info = counter_arch_bp(wp);
 
 	info->trigger = addr;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c8dcfdbda1f40..5783fd921cc45 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1305,15 +1305,31 @@ extern int perf_event_output(struct perf_event *event,
 			     struct pt_regs *regs);
 
 static inline bool
-is_default_overflow_handler(struct perf_event *event)
+__is_default_overflow_handler(perf_overflow_handler_t overflow_handler)
 {
-	if (likely(event->overflow_handler == perf_event_output_forward))
+	if (likely(overflow_handler == perf_event_output_forward))
 		return true;
-	if (unlikely(event->overflow_handler == perf_event_output_backward))
+	if (unlikely(overflow_handler == perf_event_output_backward))
 		return true;
 	return false;
 }
 
+#define is_default_overflow_handler(event) \
+	__is_default_overflow_handler((event)->overflow_handler)
+
+#ifdef CONFIG_BPF_SYSCALL
+static inline bool uses_default_overflow_handler(struct perf_event *event)
+{
+	if (likely(is_default_overflow_handler(event)))
+		return true;
+
+	return __is_default_overflow_handler(event->orig_overflow_handler);
+}
+#else
+#define uses_default_overflow_handler(event) \
+	is_default_overflow_handler(event)
+#endif
+
 extern void
 perf_event_header__init_id(struct perf_event_header *header,
 			   struct perf_sample_data *data,
From: Mårten Lindahl <marten.lindahl@axis.com>
[ Upstream commit 8922ba71c969d2a0c01a94372a71477d879470de ]
If a panic is triggered by a hrtimer interrupt all online cpus will be notified and set offline. But as highlighted by commit 19dbdcb8039c ("smp: Warn on function calls from softirq context") this call should not be made synchronous with disabled interrupts:
 softdog: Initiating panic
 Kernel panic - not syncing: Software Watchdog Timer expired
 WARNING: CPU: 1 PID: 0 at kernel/smp.c:753 smp_call_function_many_cond
 unwind_backtrace:
  show_stack
  dump_stack_lvl
  __warn
  warn_slowpath_fmt
  smp_call_function_many_cond
  smp_call_function
  crash_smp_send_stop.part.0
  machine_crash_shutdown
  __crash_kexec
  panic
  softdog_fire
  __hrtimer_run_queues
  hrtimer_interrupt
Make the smp call for machine_crash_nonpanic_core() asynchronous.
Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/arm/kernel/machine_kexec.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 46364b699cc30..5d07cf9e0044d 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -94,16 +94,28 @@ static void machine_crash_nonpanic_core(void *unused)
 	}
 }
 
+static DEFINE_PER_CPU(call_single_data_t, cpu_stop_csd) =
+	CSD_INIT(machine_crash_nonpanic_core, NULL);
+
 void crash_smp_send_stop(void)
 {
 	static int cpus_stopped;
 	unsigned long msecs;
+	call_single_data_t *csd;
+	int cpu, this_cpu = raw_smp_processor_id();
 
 	if (cpus_stopped)
 		return;
 
 	atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1);
-	smp_call_function(machine_crash_nonpanic_core, NULL, false);
+	for_each_online_cpu(cpu) {
+		if (cpu == this_cpu)
+			continue;
+
+		csd = &per_cpu(cpu_stop_csd, cpu);
+		smp_call_function_single_async(cpu, csd);
+	}
+
 	msecs = 1000; /* Wait at most a second for the other cpus to stop */
 	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
 		mdelay(1);
From: ruanjinjie <ruanjinjie@huawei.com>
[ Upstream commit afda85b963c12947e298ad85d757e333aa40fd74 ]
If device_register() returns an error in ibmebus_bus_init(), the name of the kobject, which was allocated in dev_set_name() called from device_add(), is leaked.
As the comment on device_add() says, the caller should call put_device() to drop the reference count that was set in device_initialize() when it fails, so that the name can be freed in kobject_cleanup().
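The same error-handling pattern as a stand-alone sketch with made-up example_bus names (not the actual ibmebus code): when device_register() fails, the embedded kobject already holds the reference taken by device_initialize() and the name set by dev_set_name(), so the error path has to drop it with put_device() before unwinding:

  #include <linux/device.h>
  #include <linux/init.h>

  static struct bus_type example_bus_type = {
          .name = "example",
  };

  static struct device example_bus_device = {
          .init_name = "example",
  };

  static int __init example_bus_init(void)
  {
          int err;

          err = bus_register(&example_bus_type);
          if (err)
                  return err;

          err = device_register(&example_bus_device);
          if (err) {
                  pr_warn("%s: device_register returned %i\n", __func__, err);
                  /* Drop the reference from device_initialize(); this also frees
                   * the kobject name allocated by dev_set_name() in device_add(). */
                  put_device(&example_bus_device);
                  bus_unregister(&example_bus_type);
          }

          return err;
  }
  postcore_initcall(example_bus_init);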
Signed-off-by: ruanjinjie <ruanjinjie@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20221110011929.3709774-1-ruanjinjie@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/powerpc/platforms/pseries/ibmebus.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/platforms/pseries/ibmebus.c b/arch/powerpc/platforms/pseries/ibmebus.c
index 44703f13985bf..969cb9fc960f8 100644
--- a/arch/powerpc/platforms/pseries/ibmebus.c
+++ b/arch/powerpc/platforms/pseries/ibmebus.c
@@ -460,6 +460,7 @@ static int __init ibmebus_bus_init(void)
 	if (err) {
 		printk(KERN_WARNING "%s: device_register returned %i\n",
 		       __func__, err);
+		put_device(&ibmebus_bus_device);
 		bus_unregister(&ibmebus_bus_type);
 
 		return err;
From: Nirmal Patel <nirmal.patel@linux.intel.com>
[ Upstream commit f73eedc90bf73d48e8368e6b0b4ad76a7fffaef7 ]
During the domain reset process vmd_domain_reset() clears the PCI configuration space of the VMD root ports. But certain platforms have observed the following errors and failed to boot:

  ...
  DMAR: VT-d detected Invalidation Queue Error: Reason f
  DMAR: VT-d detected Invalidation Time-out Error: SID ffff
  DMAR: VT-d detected Invalidation Completion Error: SID ffff
  DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0
  DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0
  DMAR: Invalidation Time-out Error (ITE) cleared
The root cause is that memset_io() clears the prefetchable memory base/limit registers and the prefetchable base/limit upper 32 bits registers sequentially. This ends up enabling prefetchable memory even if the device originally had it disabled.
Here is an example (before memset_io()):
  PCI configuration space for 10000:00:00.0:
    86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00
    00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20
    00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00
    ...
So, prefetchable memory is ffffffff00000000-575000fffff, which is disabled. When memset_io() clears prefetchable base 32 bits register, the prefetchable memory becomes 0000000000000000-575000fffff, which is enabled and incorrect.
Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec:
The Prefetchable Memory Limit register must be programmed to a smaller value than the Prefetchable Memory Base register if there is no prefetchable memory on the secondary side of the bridge.
This is believed to be the reason for the failure; in addition, the sequence of operations in vmd_domain_reset() does not follow the PCIe spec.
Disable the bridge window by executing a sequence of operations borrowed from pci_disable_bridge_window() and pci_setup_bridge_io(), that comply with the PCI specifications.
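To make the disable values above concrete, here is a small stand-alone check (plain C, not driver code), assuming the usual layout in which the base field occupies the low 16 bits of the PCI_MEMORY_BASE/PCI_PREF_MEMORY_BASE dword and the limit field the high 16 bits:

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint32_t dword = 0x0000fff0;     /* value written by the patch  */
          uint16_t base  = dword & 0xffff; /* low half  -> base  = 0xfff0 */
          uint16_t limit = dword >> 16;    /* high half -> limit = 0x0000 */

          /* limit < base: the bridge forwards nothing through this window. */
          printf("base=0x%04x limit=0x%04x -> window %s\n",
                 base, limit, limit < base ? "disabled" : "enabled");
          return 0;
  }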
Link: https://lore.kernel.org/r/20230810215029.1177379-1-nirmal.patel@linux.intel....
Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pci/controller/vmd.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index e718a816d4814..ad56df98b8e63 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -541,8 +541,23 @@ static void vmd_domain_reset(struct vmd_dev *vmd)
 					     PCI_CLASS_BRIDGE_PCI))
 				continue;
 
-			memset_io(base + PCI_IO_BASE, 0,
-				  PCI_ROM_ADDRESS1 - PCI_IO_BASE);
+			/*
+			 * Temporarily disable the I/O range before updating
+			 * PCI_IO_BASE.
+			 */
+			writel(0x0000ffff, base + PCI_IO_BASE_UPPER16);
+			/* Update lower 16 bits of I/O base/limit */
+			writew(0x00f0, base + PCI_IO_BASE);
+			/* Update upper 16 bits of I/O base/limit */
+			writel(0, base + PCI_IO_BASE_UPPER16);
+
+			/* MMIO Base/Limit */
+			writel(0x0000fff0, base + PCI_MEMORY_BASE);
+
+			/* Prefetchable MMIO Base/Limit */
+			writel(0, base + PCI_PREF_LIMIT_UPPER32);
+			writel(0x0000fff0, base + PCI_PREF_MEMORY_BASE);
+			writel(0xffffffff, base + PCI_PREF_BASE_UPPER32);
 		}
 	}
 }
From: Yong-Xuan Wang <yongxuan.wang@sifive.com>
[ Upstream commit 551a60e1225e71fff8efd9390204c505b0870e0f ]
The iMSI-RX module of the DW PCIe controller provides multiple sets of MSI_CTRL_INT_i_* registers, and each set is capable of handling 32 MSI interrupts. However, the fu740 PCIe controller driver only enabled one set of MSI_CTRL_INT_i_* registers, as the total number of supported interrupts was not specified.
Set the supported number of MSI vectors to enable all the MSI_CTRL_INT_i_* registers on the fu740 PCIe core, allowing the system to fully utilize the available MSI interrupts.
Link: https://lore.kernel.org/r/20230807055621.2431-1-yongxuan.wang@sifive.com
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pci/controller/dwc/pcie-fu740.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/dwc/pcie-fu740.c b/drivers/pci/controller/dwc/pcie-fu740.c
index 0c90583c078bf..1e9b44b8bba48 100644
--- a/drivers/pci/controller/dwc/pcie-fu740.c
+++ b/drivers/pci/controller/dwc/pcie-fu740.c
@@ -299,6 +299,7 @@ static int fu740_pcie_probe(struct platform_device *pdev)
 	pci->dev = dev;
 	pci->ops = &dw_pcie_ops;
 	pci->pp.ops = &fu740_pcie_host_ops;
+	pci->pp.num_vectors = MAX_MSI_IRQS;
 
 	/* SiFive specific region: mgmt */
 	afp->mgmt_base = devm_platform_ioremap_resource_byname(pdev, "mgmt");