From: Jeimon jjjinmeng.zhou@gmail.com
[ Upstream commit 8ab78863e9eff11910e1ac8bcf478060c29b379e ]
The function rawsock_create() calls a privileged function sk_alloc(), which requires a ns-aware check to check net->user_ns, i.e., ns_capable(). However, the original code checks the init_user_ns using capable(). So we replace the capable() with ns_capable().
Signed-off-by: Jeimon jjjinmeng.zhou@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/nfc/rawsock.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c index 92a3cfae4de8..2fba626a0125 100644 --- a/net/nfc/rawsock.c +++ b/net/nfc/rawsock.c @@ -345,7 +345,7 @@ static int rawsock_create(struct net *net, struct socket *sock, return -ESOCKTNOSUPPORT;
if (sock->type == SOCK_RAW) { - if (!capable(CAP_NET_RAW)) + if (!ns_capable(net->user_ns, CAP_NET_RAW)) return -EPERM; sock->ops = &rawsock_raw_ops; } else {
From: Zou Wei zou_wei@huawei.com
[ Upstream commit e072b2671606c77538d6a4dd5dda80b508cb4816 ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Link: https://lore.kernel.org/r/1620789145-14936-1-git-send-email-zou_wei@huawei.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/sti-sas.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/codecs/sti-sas.c b/sound/soc/codecs/sti-sas.c index d6e00c77edcd..7cf76661c3cc 100644 --- a/sound/soc/codecs/sti-sas.c +++ b/sound/soc/codecs/sti-sas.c @@ -542,6 +542,7 @@ static const struct of_device_id sti_sas_dev_match[] = { }, {}, }; +MODULE_DEVICE_TABLE(of, sti_sas_dev_match);
static int sti_sas_driver_probe(struct platform_device *pdev) {
From: Zheyu Ma zheyuma97@gmail.com
[ Upstream commit 9f6f852550d0e1b7735651228116ae9d300f69b3 ]
'nj_setup' in netjet.c might fail with -EIO and in this case 'card->irq' is initialized and is bigger than zero. A subsequent call to 'nj_release' will free the irq that has not been requested.
Fix this bug by deleting the previous assignment to 'card->irq' and just keep the assignment before 'request_irq'.
The KASAN's log reveals it:
[ 3.354615 ] WARNING: CPU: 0 PID: 1 at kernel/irq/manage.c:1826 free_irq+0x100/0x480 [ 3.355112 ] Modules linked in: [ 3.355310 ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00144-g25a1298726e #13 [ 3.355816 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.356552 ] RIP: 0010:free_irq+0x100/0x480 [ 3.356820 ] Code: 6e 08 74 6f 4d 89 f4 e8 5e ac 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 4f ac 09 00 8b 75 c8 48 c7 c7 78 c1 2e 85 e8 e0 cf f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 72 33 0b 03 48 8b 43 40 4c 8b a0 80 [ 3.358012 ] RSP: 0000:ffffc90000017b48 EFLAGS: 00010082 [ 3.358357 ] RAX: 0000000000000000 RBX: ffff888104dc8000 RCX: 0000000000000000 [ 3.358814 ] RDX: ffff8881003c8000 RSI: ffffffff8124a9e6 RDI: 00000000ffffffff [ 3.359272 ] RBP: ffffc90000017b88 R08: 0000000000000000 R09: 0000000000000000 [ 3.359732 ] R10: ffffc900000179f0 R11: 0000000000001d04 R12: 0000000000000000 [ 3.360195 ] R13: ffff888107dc6000 R14: ffff888107dc6928 R15: ffff888104dc80a8 [ 3.360652 ] FS: 0000000000000000(0000) GS:ffff88817bc00000(0000) knlGS:0000000000000000 [ 3.361170 ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.361538 ] CR2: 0000000000000000 CR3: 000000000582e000 CR4: 00000000000006f0 [ 3.362003 ] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.362175 ] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3.362175 ] Call Trace: [ 3.362175 ] nj_release+0x51/0x1e0 [ 3.362175 ] nj_probe+0x450/0x950 [ 3.362175 ] ? pci_device_remove+0x110/0x110 [ 3.362175 ] local_pci_probe+0x45/0xa0 [ 3.362175 ] pci_device_probe+0x12b/0x1d0 [ 3.362175 ] really_probe+0x2a9/0x610 [ 3.362175 ] driver_probe_device+0x90/0x1d0 [ 3.362175 ] ? mutex_lock_nested+0x1b/0x20 [ 3.362175 ] device_driver_attach+0x68/0x70 [ 3.362175 ] __driver_attach+0x124/0x1b0 [ 3.362175 ] ? device_driver_attach+0x70/0x70 [ 3.362175 ] bus_for_each_dev+0xbb/0x110 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] driver_attach+0x27/0x30 [ 3.362175 ] bus_add_driver+0x1eb/0x2a0 [ 3.362175 ] driver_register+0xa9/0x180 [ 3.362175 ] __pci_register_driver+0x82/0x90 [ 3.362175 ] ? w6692_init+0x38/0x38 [ 3.362175 ] nj_init+0x36/0x38 [ 3.362175 ] do_one_initcall+0x7f/0x3d0 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] ? rcu_read_lock_sched_held+0x4f/0x80 [ 3.362175 ] kernel_init_freeable+0x2aa/0x301 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] kernel_init+0x18/0x190 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ret_from_fork+0x1f/0x30 [ 3.362175 ] Kernel panic - not syncing: panic_on_warn set ... [ 3.362175 ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00144-g25a1298726e #13 [ 3.362175 ] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.362175 ] Call Trace: [ 3.362175 ] dump_stack+0xba/0xf5 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] panic+0x15a/0x3f2 [ 3.362175 ] ? __warn+0xf2/0x150 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] __warn+0x108/0x150 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] report_bug+0x119/0x1c0 [ 3.362175 ] handle_bug+0x3b/0x80 [ 3.362175 ] exc_invalid_op+0x18/0x70 [ 3.362175 ] asm_exc_invalid_op+0x12/0x20 [ 3.362175 ] RIP: 0010:free_irq+0x100/0x480 [ 3.362175 ] Code: 6e 08 74 6f 4d 89 f4 e8 5e ac 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 4f ac 09 00 8b 75 c8 48 c7 c7 78 c1 2e 85 e8 e0 cf f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 72 33 0b 03 48 8b 43 40 4c 8b a0 80 [ 3.362175 ] RSP: 0000:ffffc90000017b48 EFLAGS: 00010082 [ 3.362175 ] RAX: 0000000000000000 RBX: ffff888104dc8000 RCX: 0000000000000000 [ 3.362175 ] RDX: ffff8881003c8000 RSI: ffffffff8124a9e6 RDI: 00000000ffffffff [ 3.362175 ] RBP: ffffc90000017b88 R08: 0000000000000000 R09: 0000000000000000 [ 3.362175 ] R10: ffffc900000179f0 R11: 0000000000001d04 R12: 0000000000000000 [ 3.362175 ] R13: ffff888107dc6000 R14: ffff888107dc6928 R15: ffff888104dc80a8 [ 3.362175 ] ? vprintk+0x76/0x150 [ 3.362175 ] ? free_irq+0x100/0x480 [ 3.362175 ] nj_release+0x51/0x1e0 [ 3.362175 ] nj_probe+0x450/0x950 [ 3.362175 ] ? pci_device_remove+0x110/0x110 [ 3.362175 ] local_pci_probe+0x45/0xa0 [ 3.362175 ] pci_device_probe+0x12b/0x1d0 [ 3.362175 ] really_probe+0x2a9/0x610 [ 3.362175 ] driver_probe_device+0x90/0x1d0 [ 3.362175 ] ? mutex_lock_nested+0x1b/0x20 [ 3.362175 ] device_driver_attach+0x68/0x70 [ 3.362175 ] __driver_attach+0x124/0x1b0 [ 3.362175 ] ? device_driver_attach+0x70/0x70 [ 3.362175 ] bus_for_each_dev+0xbb/0x110 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] driver_attach+0x27/0x30 [ 3.362175 ] bus_add_driver+0x1eb/0x2a0 [ 3.362175 ] driver_register+0xa9/0x180 [ 3.362175 ] __pci_register_driver+0x82/0x90 [ 3.362175 ] ? w6692_init+0x38/0x38 [ 3.362175 ] nj_init+0x36/0x38 [ 3.362175 ] do_one_initcall+0x7f/0x3d0 [ 3.362175 ] ? rdinit_setup+0x45/0x45 [ 3.362175 ] ? rcu_read_lock_sched_held+0x4f/0x80 [ 3.362175 ] kernel_init_freeable+0x2aa/0x301 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] kernel_init+0x18/0x190 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ? rest_init+0x2c0/0x2c0 [ 3.362175 ] ret_from_fork+0x1f/0x30 [ 3.362175 ] Dumping ftrace buffer: [ 3.362175 ] (ftrace buffer empty) [ 3.362175 ] Kernel Offset: disabled [ 3.362175 ] Rebooting in 1 seconds..
Reported-by: Zheyu Ma zheyuma97@gmail.com Signed-off-by: Zheyu Ma zheyuma97@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/isdn/hardware/mISDN/netjet.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/isdn/hardware/mISDN/netjet.c b/drivers/isdn/hardware/mISDN/netjet.c index afde4edef9ae..6dea4c180c49 100644 --- a/drivers/isdn/hardware/mISDN/netjet.c +++ b/drivers/isdn/hardware/mISDN/netjet.c @@ -1114,7 +1114,6 @@ nj_probe(struct pci_dev *pdev, const struct pci_device_id *ent) card->typ = NETJET_S_TJ300;
card->base = pci_resource_start(pdev, 0); - card->irq = pdev->irq; pci_set_drvdata(pdev, card); err = setup_instance(card); if (err)
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 35d96e631860226d5dc4de0fad0a415362ec2457 ]
If bond_kobj_init() or later kzalloc() in bond_alloc_slave() fail, then we call kobject_put() on the slave->kobj. This in turn calls the release function slave_kobj_release() which will always try to cancel_delayed_work_sync(&slave->notify_work), which shouldn't be done on an uninitialized work struct.
Always initialize the work struct earlier to avoid problems here.
Syzbot bisected this down to a completely pointless commit, some fault injection may have been at work here that caused the alloc failure in the first place, which may interact badly with bisect.
Reported-by: syzbot+bfda097c12a00c8cae67@syzkaller.appspotmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Acked-by: Jay Vosburgh jay.vosburgh@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 16437aa35bc4..2b721ed392ad 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1280,6 +1280,7 @@ static struct slave *bond_alloc_slave(struct bonding *bond,
slave->bond = bond; slave->dev = slave_dev; + INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work);
if (bond_kobj_init(slave)) return NULL; @@ -1292,7 +1293,6 @@ static struct slave *bond_alloc_slave(struct bonding *bond, return NULL; } } - INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work);
return slave; }
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 1d482e666b8e74c7555dbdfbfb77205eeed3ff2d ]
Syzbot reports that in mac80211 we have a potential deadlock between our "local->stop_queue_reasons_lock" (spinlock) and netlink's nl_table_lock (rwlock). This is because there's at least one situation in which we might try to send a netlink message with this spinlock held while it is also possible to take the spinlock from a hardirq context, resulting in the following deadlock scenario reported by lockdep:
CPU0 CPU1 ---- ---- lock(nl_table_lock); local_irq_disable(); lock(&local->queue_stop_reason_lock); lock(nl_table_lock); <Interrupt> lock(&local->queue_stop_reason_lock);
This seems valid, we can take the queue_stop_reason_lock in any kind of context ("CPU0"), and call ieee80211_report_ack_skb() with the spinlock held and IRQs disabled ("CPU1") in some code path (ieee80211_do_stop() via ieee80211_free_txskb()).
Short of disallowing netlink use in scenarios like these (which would be rather complex in mac80211's case due to the deep callchain), it seems the only fix for this is to disable IRQs while nl_table_lock is held to avoid hitting this scenario, this disallows the "CPU0" portion of the reported deadlock.
Note that the writer side (netlink_table_grab()) already disables IRQs for this lock.
Unfortunately though, this seems like a huge hammer, and maybe the whole netlink table locking should be reworked.
Reported-by: syzbot+69ff9dff50dcfe14ddd4@syzkaller.appspotmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/netlink/af_netlink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 205865292ba3..541410f1c3b7 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -436,11 +436,13 @@ void netlink_table_ungrab(void) static inline void netlink_lock_table(void) { + unsigned long flags; + /* read_lock() synchronizes us to netlink_table_grab */
- read_lock(&nl_table_lock); + read_lock_irqsave(&nl_table_lock, flags); atomic_inc(&nl_table_users); - read_unlock(&nl_table_lock); + read_unlock_irqrestore(&nl_table_lock, flags); }
static inline void
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 1dde47a66d4fb181830d6fa000e5ea86907b639e ]
We spotted a bug recently during a review where a driver was unregistering a bus that wasn't registered, which would trigger this BUG_ON(). Let's handle that situation more gracefully, and just print a warning and return.
Reported-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/mdio_bus.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index a9bbdcec0bad..8cc7563ab103 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -362,7 +362,8 @@ void mdiobus_unregister(struct mii_bus *bus) struct mdio_device *mdiodev; int i;
- BUG_ON(bus->state != MDIOBUS_REGISTERED); + if (WARN_ON_ONCE(bus->state != MDIOBUS_REGISTERED)) + return; bus->state = MDIOBUS_UNREGISTERED;
for (i = 0; i < PHY_MAX_ADDR; i++) {
From: Shakeel Butt shakeelb@google.com
[ Upstream commit 45e1ba40837ac2f6f4d4716bddb8d44bd7e4a251 ]
This patch effectively reverts the commit a3e72739b7a7 ("cgroup: fix too early usage of static_branch_disable()"). The commit 6041186a3258 ("init: initialize jump labels before command line option parsing") has moved the jump_label_init() before parse_args() which has made the commit a3e72739b7a7 unnecessary. On the other hand there are consequences of disabling the controllers later as there are subsystems doing the controller checks for different decisions. One such incident is reported [1] regarding the memory controller and its impact on memory reclaim code.
[1] https://lore.kernel.org/linux-mm/921e53f3-4b13-aab8-4a9e-e83ff15371e4@nec.co...
Signed-off-by: Shakeel Butt shakeelb@google.com Reported-by: NOMURA JUNICHI(野村 淳一) junichi.nomura@nec.com Signed-off-by: Tejun Heo tj@kernel.org Tested-by: Jun'ichi Nomura junichi.nomura@nec.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/cgroup.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 684d02f343b4..3e0fca894a8b 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -5636,8 +5636,6 @@ int __init cgroup_init_early(void) return 0; }
-static u16 cgroup_disable_mask __initdata; - /** * cgroup_init - cgroup initialization * @@ -5695,12 +5693,8 @@ int __init cgroup_init(void) * disabled flag and cftype registration needs kmalloc, * both of which aren't available during early_init. */ - if (cgroup_disable_mask & (1 << ssid)) { - static_branch_disable(cgroup_subsys_enabled_key[ssid]); - printk(KERN_INFO "Disabling %s control group subsystem\n", - ss->name); + if (!cgroup_ssid_enabled(ssid)) continue; - }
if (cgroup_ssid_no_v1(ssid)) printk(KERN_INFO "Disabling %s control group subsystem in v1 mounts\n", @@ -6143,7 +6137,10 @@ static int __init cgroup_disable(char *str) if (strcmp(token, ss->name) && strcmp(token, ss->legacy_name)) continue; - cgroup_disable_mask |= 1 << i; + + static_branch_disable(cgroup_subsys_enabled_key[i]); + pr_info("Disabling %s control group subsystem\n", + ss->name); } } return 1;
From: Sergey Senozhatsky senozhatsky@chromium.org
[ Upstream commit 940d71c6462e8151c78f28e4919aa8882ff2054e ]
If VCPU is suspended (VM suspend) in wq_watchdog_timer_fn() then once this VCPU resumes it will see the new jiffies value, while it may take a while before IRQ detects PVCLOCK_GUEST_STOPPED on this VCPU and updates all the watchdogs via pvclock_touch_watchdogs(). There is a small chance of misreported WQ stalls in the meantime, because new jiffies is time_after() old 'ts + thresh'.
wq_watchdog_timer_fn() { for_each_pool(pool, pi) { if (time_after(jiffies, ts + thresh)) { pr_emerg("BUG: workqueue lockup - pool"); } } }
Save jiffies at the beginning of this function and use that value for stall detection. If VM gets suspended then we continue using "old" jiffies value and old WQ touch timestamps. If IRQ at some point restarts the stall detection cycle (pvclock_touch_watchdogs()) then old jiffies will always be before new 'ts + thresh'.
Signed-off-by: Sergey Senozhatsky senozhatsky@chromium.org Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/workqueue.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 3231088afd73..a410d5827a73 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -49,6 +49,7 @@ #include <linux/moduleparam.h> #include <linux/uaccess.h> #include <linux/nmi.h> +#include <linux/kvm_para.h>
#include "workqueue_internal.h"
@@ -5387,6 +5388,7 @@ static void wq_watchdog_timer_fn(unsigned long data) { unsigned long thresh = READ_ONCE(wq_watchdog_thresh) * HZ; bool lockup_detected = false; + unsigned long now = jiffies; struct worker_pool *pool; int pi;
@@ -5401,6 +5403,12 @@ static void wq_watchdog_timer_fn(unsigned long data) if (list_empty(&pool->worklist)) continue;
+ /* + * If a virtual machine is stopped by the host it can look to + * the watchdog like a stall. + */ + kvm_check_and_clear_guest_paused(); + /* get the latest of pool and touched timestamps */ pool_ts = READ_ONCE(pool->watchdog_ts); touched = READ_ONCE(wq_watchdog_touched); @@ -5419,12 +5427,12 @@ static void wq_watchdog_timer_fn(unsigned long data) }
/* did we stall? */ - if (time_after(jiffies, ts + thresh)) { + if (time_after(now, ts + thresh)) { lockup_detected = true; pr_emerg("BUG: workqueue lockup - pool"); pr_cont_pool_info(pool); pr_cont(" stuck for %us!\n", - jiffies_to_msecs(jiffies - pool_ts) / 1000); + jiffies_to_msecs(now - pool_ts) / 1000); } }
From: Zheyu Ma zheyuma97@gmail.com
[ Upstream commit 13a6f3153922391e90036ba2267d34eed63196fc ]
When calling the 'ql_sem_spinlock', the driver has already acquired the spin lock, so the driver should not call 'ssleep' in atomic context.
This bug can be fixed by using 'mdelay' instead of 'ssleep'.
The KASAN's log reveals it:
[ 3.238124 ] BUG: scheduling while atomic: swapper/0/1/0x00000002 [ 3.238748 ] 2 locks held by swapper/0/1: [ 3.239151 ] #0: ffff88810177b240 (&dev->mutex){....}-{3:3}, at: __device_driver_lock+0x41/0x60 [ 3.240026 ] #1: ffff888107c60e28 (&qdev->hw_lock){....}-{2:2}, at: ql3xxx_probe+0x2aa/0xea0 [ 3.240873 ] Modules linked in: [ 3.241187 ] irq event stamp: 460854 [ 3.241541 ] hardirqs last enabled at (460853): [<ffffffff843051bf>] _raw_spin_unlock_irqrestore+0x4f/0x70 [ 3.242245 ] hardirqs last disabled at (460854): [<ffffffff843058ca>] _raw_spin_lock_irqsave+0x2a/0x70 [ 3.242245 ] softirqs last enabled at (446076): [<ffffffff846002e4>] __do_softirq+0x2e4/0x4b1 [ 3.242245 ] softirqs last disabled at (446069): [<ffffffff811ba5e0>] irq_exit_rcu+0x100/0x110 [ 3.242245 ] Preemption disabled at: [ 3.242245 ] [<ffffffff828ca5ba>] ql3xxx_probe+0x2aa/0xea0 [ 3.242245 ] Kernel panic - not syncing: scheduling while atomic [ 3.242245 ] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1-00145 -gee7dc339169-dirty #16 [ 3.242245 ] Call Trace: [ 3.242245 ] dump_stack+0xba/0xf5 [ 3.242245 ] ? ql3xxx_probe+0x1f0/0xea0 [ 3.242245 ] panic+0x15a/0x3f2 [ 3.242245 ] ? vprintk+0x76/0x150 [ 3.242245 ] ? ql3xxx_probe+0x2aa/0xea0 [ 3.242245 ] __schedule_bug+0xae/0xe0 [ 3.242245 ] __schedule+0x72e/0xa00 [ 3.242245 ] schedule+0x43/0xf0 [ 3.242245 ] schedule_timeout+0x28b/0x500 [ 3.242245 ] ? del_timer_sync+0xf0/0xf0 [ 3.242245 ] ? msleep+0x2f/0x70 [ 3.242245 ] msleep+0x59/0x70 [ 3.242245 ] ql3xxx_probe+0x307/0xea0 [ 3.242245 ] ? _raw_spin_unlock_irqrestore+0x3a/0x70 [ 3.242245 ] ? pci_device_remove+0x110/0x110 [ 3.242245 ] local_pci_probe+0x45/0xa0 [ 3.242245 ] pci_device_probe+0x12b/0x1d0 [ 3.242245 ] really_probe+0x2a9/0x610 [ 3.242245 ] driver_probe_device+0x90/0x1d0 [ 3.242245 ] ? mutex_lock_nested+0x1b/0x20 [ 3.242245 ] device_driver_attach+0x68/0x70 [ 3.242245 ] __driver_attach+0x124/0x1b0 [ 3.242245 ] ? device_driver_attach+0x70/0x70 [ 3.242245 ] bus_for_each_dev+0xbb/0x110 [ 3.242245 ] ? rdinit_setup+0x45/0x45 [ 3.242245 ] driver_attach+0x27/0x30 [ 3.242245 ] bus_add_driver+0x1eb/0x2a0 [ 3.242245 ] driver_register+0xa9/0x180 [ 3.242245 ] __pci_register_driver+0x82/0x90 [ 3.242245 ] ? yellowfin_init+0x25/0x25 [ 3.242245 ] ql3xxx_driver_init+0x23/0x25 [ 3.242245 ] do_one_initcall+0x7f/0x3d0 [ 3.242245 ] ? rdinit_setup+0x45/0x45 [ 3.242245 ] ? rcu_read_lock_sched_held+0x4f/0x80 [ 3.242245 ] kernel_init_freeable+0x2aa/0x301 [ 3.242245 ] ? rest_init+0x2c0/0x2c0 [ 3.242245 ] kernel_init+0x18/0x190 [ 3.242245 ] ? rest_init+0x2c0/0x2c0 [ 3.242245 ] ? rest_init+0x2c0/0x2c0 [ 3.242245 ] ret_from_fork+0x1f/0x30 [ 3.242245 ] Dumping ftrace buffer: [ 3.242245 ] (ftrace buffer empty) [ 3.242245 ] Kernel Offset: disabled [ 3.242245 ] Rebooting in 1 seconds.
Reported-by: Zheyu Ma zheyuma97@gmail.com Signed-off-by: Zheyu Ma zheyuma97@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qla3xxx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qla3xxx.c b/drivers/net/ethernet/qlogic/qla3xxx.c index f2cb77c3b199..192950a112c9 100644 --- a/drivers/net/ethernet/qlogic/qla3xxx.c +++ b/drivers/net/ethernet/qlogic/qla3xxx.c @@ -115,7 +115,7 @@ static int ql_sem_spinlock(struct ql3_adapter *qdev, value = readl(&port_regs->CommonRegs.semaphoreReg); if ((value & (sem_mask >> 16)) == sem_bits) return 0; - ssleep(1); + mdelay(1000); } while (--seconds); return -1; }
From: Matt Wang wwentao@vmware.com
[ Upstream commit e662502b3a782d479e67736a5a1c169a703d853a ]
Some commands (such as INQUIRY) may return less data than the initiator requested. To avoid conducting useless information, set the right residual count to make upper layer aware of this.
Before (INQUIRY PAGE 0xB0 with 128B buffer):
$ sg_raw -r 128 /dev/sda 12 01 B0 00 80 00 SCSI Status: Good
Received 128 bytes of data: 00 00 b0 00 3c 01 00 00 00 00 00 00 00 00 00 00 00 ...<............ 10 00 00 00 00 00 01 00 00 00 00 00 40 00 00 08 00 ...........@.... 20 80 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 .......... ..... 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
After:
$ sg_raw -r 128 /dev/sda 12 01 B0 00 80 00 SCSI Status: Good
Received 64 bytes of data: 00 00 b0 00 3c 01 00 00 00 00 00 00 00 00 00 00 00 ...<............ 10 00 00 00 00 00 01 00 00 00 00 00 40 00 00 08 00 ...........@.... 20 80 00 00 00 00 00 00 00 00 00 20 00 00 00 00 00 .......... ..... 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[mkp: clarified description]
Link: https://lore.kernel.org/r/03C41093-B62E-43A2-913E-CFC92F1C70C3@vmware.com Signed-off-by: Matt Wang wwentao@vmware.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/vmw_pvscsi.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/vmw_pvscsi.c b/drivers/scsi/vmw_pvscsi.c index df6fabcce4f7..4d2172c115c6 100644 --- a/drivers/scsi/vmw_pvscsi.c +++ b/drivers/scsi/vmw_pvscsi.c @@ -577,7 +577,13 @@ static void pvscsi_complete_request(struct pvscsi_adapter *adapter, case BTSTAT_SUCCESS: case BTSTAT_LINKED_COMMAND_COMPLETED: case BTSTAT_LINKED_COMMAND_COMPLETED_WITH_FLAG: - /* If everything went fine, let's move on.. */ + /* + * Commands like INQUIRY may transfer less data than + * requested by the initiator via bufflen. Set residual + * count to make upper layer aware of the actual amount + * of data returned. + */ + scsi_set_resid(cmd, scsi_bufflen(cmd) - e->dataLen); cmd->result = (DID_OK << 16); break;
From: Dmitry Bogdanov d.bogdanov@yadro.com
[ Upstream commit 2ef7665dfd88830f15415ba007c7c9a46be7acd8 ]
Target de-configuration panics at high CPU load because TPGT and WWPN can be removed on separate threads.
TPGT removal requests a reset HBA on a separate thread and waits for reset complete (phase1). Due to high CPU load that HBA reset can be delayed for some time.
WWPN removal does qlt_stop_phase2(). There it is believed that phase1 has already completed and thus tgt.tgt_ops is subsequently cleared. However, tgt.tgt_ops is needed to process incoming traffic and therefore this will cause one of the following panics:
NIP qlt_reset+0x7c/0x220 [qla2xxx] LR qlt_reset+0x68/0x220 [qla2xxx] Call Trace: 0xc000003ffff63a78 (unreliable) qlt_handle_imm_notify+0x800/0x10c0 [qla2xxx] qlt_24xx_atio_pkt+0x208/0x590 [qla2xxx] qlt_24xx_process_atio_queue+0x33c/0x7a0 [qla2xxx] qla83xx_msix_atio_q+0x54/0x90 [qla2xxx]
or
NIP qlt_24xx_handle_abts+0xd0/0x2a0 [qla2xxx] LR qlt_24xx_handle_abts+0xb4/0x2a0 [qla2xxx] Call Trace: qlt_24xx_handle_abts+0x90/0x2a0 [qla2xxx] (unreliable) qlt_24xx_process_atio_queue+0x500/0x7a0 [qla2xxx] qla83xx_msix_atio_q+0x54/0x90 [qla2xxx]
or
NIP qlt_create_sess+0x90/0x4e0 [qla2xxx] LR qla24xx_do_nack_work+0xa8/0x180 [qla2xxx] Call Trace: 0xc0000000348fba30 (unreliable) qla24xx_do_nack_work+0xa8/0x180 [qla2xxx] qla2x00_do_work+0x674/0xbf0 [qla2xxx] qla2x00_iocb_work_fn
The patch fixes the issue by serializing qlt_stop_phase1() and qlt_stop_phase2() functions to make WWPN removal wait for phase1 completion.
Link: https://lore.kernel.org/r/20210415203554.27890-1-d.bogdanov@yadro.com Reviewed-by: Roman Bolshakov r.bolshakov@yadro.com Signed-off-by: Dmitry Bogdanov d.bogdanov@yadro.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_target.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c index b889caa556a0..6ef7a094ee51 100644 --- a/drivers/scsi/qla2xxx/qla_target.c +++ b/drivers/scsi/qla2xxx/qla_target.c @@ -1224,6 +1224,7 @@ void qlt_stop_phase2(struct qla_tgt *tgt) "Waiting for %d IRQ commands to complete (tgt %p)", tgt->irq_cmd_count, tgt);
+ mutex_lock(&tgt->ha->optrom_mutex); mutex_lock(&vha->vha_tgt.tgt_mutex); spin_lock_irqsave(&ha->hardware_lock, flags); while ((tgt->irq_cmd_count != 0) || (tgt->atio_irq_cmd_count != 0)) { @@ -1235,6 +1236,7 @@ void qlt_stop_phase2(struct qla_tgt *tgt) tgt->tgt_stopped = 1; spin_unlock_irqrestore(&ha->hardware_lock, flags); mutex_unlock(&vha->vha_tgt.tgt_mutex); + mutex_unlock(&tgt->ha->optrom_mutex);
ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00c, "Stop of tgt %p finished", tgt);
From: Zong Li zong.li@sifive.com
[ Upstream commit 5eff1461a6dec84f04fafa9128548bad51d96147 ]
If runtime power menagement is enabled, the gigabit ethernet PLL would be disabled after macb_probe(). During this period of time, the system would hang up if we try to access GEMGXL control registers.
We can't put runtime_pm_get/runtime_pm_put/ there due to the issue of sleep inside atomic section (7fa2955ff70ce453 ("sh_eth: Fix sleeping function called from invalid context"). Add netif_running checking to ensure the device is available before accessing GEMGXL device.
Changed in v2: - Use netif_running instead of its own flag
Signed-off-by: Zong Li zong.li@sifive.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cadence/macb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c index f20718b730e5..69fa47351a32 100644 --- a/drivers/net/ethernet/cadence/macb.c +++ b/drivers/net/ethernet/cadence/macb.c @@ -2031,6 +2031,9 @@ static struct net_device_stats *gem_get_stats(struct macb *bp) struct gem_stats *hwstat = &bp->hw_stats.gem; struct net_device_stats *nstat = &bp->stats;
+ if (!netif_running(bp->dev)) + return nstat; + gem_update_stats(bp);
nstat->rx_errors = (hwstat->rx_frame_check_sequence_errors +
From: Saubhik Mukherjee saubhik.mukherjee@gmail.com
[ Upstream commit a4dd4fc6105e54393d637450a11d4cddb5fabc4f ]
In cops_probe1(), there is a write to dev->base_addr after requesting an interrupt line and registering the interrupt handler cops_interrupt(). The handler might be called in parallel to handle an interrupt. cops_interrupt() tries to read dev->base_addr leading to a potential data race. So write to dev->base_addr before calling request_irq().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Saubhik Mukherjee saubhik.mukherjee@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/appletalk/cops.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/appletalk/cops.c b/drivers/net/appletalk/cops.c index 1b2e9217ec78..d520ce32ddbf 100644 --- a/drivers/net/appletalk/cops.c +++ b/drivers/net/appletalk/cops.c @@ -324,6 +324,8 @@ static int __init cops_probe1(struct net_device *dev, int ioaddr) break; }
+ dev->base_addr = ioaddr; + /* Reserve any actual interrupt. */ if (dev->irq) { retval = request_irq(dev->irq, cops_interrupt, 0, dev->name, dev); @@ -331,8 +333,6 @@ static int __init cops_probe1(struct net_device *dev, int ioaddr) goto err_out; }
- dev->base_addr = ioaddr; - lp = netdev_priv(dev); spin_lock_init(&lp->lock);
From: Tiezhu Yang yangtiezhu@loongson.cn
[ Upstream commit 78cf0eb926cb1abeff2106bae67752e032fe5f3e ]
When update the latest mainline kernel with the following three configs, the kernel hangs during startup:
(1) CONFIG_FUNCTION_GRAPH_TRACER=y (2) CONFIG_PREEMPT_TRACER=y (3) CONFIG_FTRACE_STARTUP_TEST=y
When update the latest mainline kernel with the above two configs (1) and (2), the kernel starts normally, but it still hangs when execute the following command:
echo "function_graph" > /sys/kernel/debug/tracing/current_tracer
Without CONFIG_PREEMPT_TRACER=y, the above two kinds of kernel hangs disappeared, so it seems that CONFIG_PREEMPT_TRACER has some influences with function_graph tracer at the first glance.
I use ejtag to find out the epc address is related with preempt_enable() in the file arch/mips/lib/mips-atomic.c, because function tracing can trace the preempt_{enable,disable} calls that are traced, replace them with preempt_{enable,disable}_notrace to prevent function tracing from going into an infinite loop, and then it can fix the kernel hang issue.
By the way, it seems that this commit is a complement and improvement of commit f93a1a00f2bd ("MIPS: Fix crash that occurs when function tracing is enabled").
Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn Cc: Steven Rostedt rostedt@goodmis.org Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/lib/mips-atomic.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/mips/lib/mips-atomic.c b/arch/mips/lib/mips-atomic.c index 5530070e0d05..57497a26e79c 100644 --- a/arch/mips/lib/mips-atomic.c +++ b/arch/mips/lib/mips-atomic.c @@ -37,7 +37,7 @@ */ notrace void arch_local_irq_disable(void) { - preempt_disable(); + preempt_disable_notrace();
__asm__ __volatile__( " .set push \n" @@ -53,7 +53,7 @@ notrace void arch_local_irq_disable(void) : /* no inputs */ : "memory");
- preempt_enable(); + preempt_enable_notrace(); } EXPORT_SYMBOL(arch_local_irq_disable);
@@ -61,7 +61,7 @@ notrace unsigned long arch_local_irq_save(void) { unsigned long flags;
- preempt_disable(); + preempt_disable_notrace();
__asm__ __volatile__( " .set push \n" @@ -78,7 +78,7 @@ notrace unsigned long arch_local_irq_save(void) : /* no inputs */ : "memory");
- preempt_enable(); + preempt_enable_notrace();
return flags; } @@ -88,7 +88,7 @@ notrace void arch_local_irq_restore(unsigned long flags) { unsigned long __tmp1;
- preempt_disable(); + preempt_disable_notrace();
__asm__ __volatile__( " .set push \n" @@ -106,7 +106,7 @@ notrace void arch_local_irq_restore(unsigned long flags) : "0" (flags) : "memory");
- preempt_enable(); + preempt_enable_notrace(); } EXPORT_SYMBOL(arch_local_irq_restore);
From: Jiapeng Chong jiapeng.chong@linux.alibaba.com
[ Upstream commit 65161c35554f7135e6656b3df1ce2c500ca0bdcf ]
Eliminate the follow smatch warning:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:1227 bnx2x_iov_init_one() warn: missing error code 'err'.
Reported-by: Abaci Robot abaci@linux.alibaba.com Signed-off-by: Jiapeng Chong jiapeng.chong@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c index e8a09d0afe1c..545b59ff5d7e 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c @@ -1240,8 +1240,10 @@ int bnx2x_iov_init_one(struct bnx2x *bp, int int_mode_param, goto failed;
/* SR-IOV capability was enabled but there are no VFs*/ - if (iov->total == 0) + if (iov->total == 0) { + err = -EINVAL; goto failed; + }
iov->nr_virtfn = min_t(u16, iov->total, num_vfs_param);
From: Chris Packham chris.packham@alliedtelesis.co.nz
[ Upstream commit 7adc7b225cddcfd0f346d10144fd7a3d3d9f9ea7 ]
The i2c controllers on the P2040/P2041 have an erratum where the documented scheme for i2c bus recovery will not work (A-004447). A different mechanism is needed which is documented in the P2040 Chip Errata Rev Q (latest available at the time of writing).
Signed-off-by: Chris Packham chris.packham@alliedtelesis.co.nz Acked-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/dts/fsl/p2041si-post.dtsi | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi index 51e975d7631a..8921f17fca42 100644 --- a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi @@ -389,7 +389,23 @@ sdhc@114000 { };
/include/ "qoriq-i2c-0.dtsi" + i2c@118000 { + fsl,i2c-erratum-a004447; + }; + + i2c@118100 { + fsl,i2c-erratum-a004447; + }; + /include/ "qoriq-i2c-1.dtsi" + i2c@119000 { + fsl,i2c-erratum-a004447; + }; + + i2c@119100 { + fsl,i2c-erratum-a004447; + }; + /include/ "qoriq-duart-0.dtsi" /include/ "qoriq-duart-1.dtsi" /include/ "qoriq-gpio-0.dtsi"
From: Chris Packham chris.packham@alliedtelesis.co.nz
[ Upstream commit 19ae697a1e4edf1d755b413e3aa38da65e2db23b ]
The i2c controllers on the P1010 have an erratum where the documented scheme for i2c bus recovery will not work (A-004447). A different mechanism is needed which is documented in the P1010 Chip Errata Rev L.
Signed-off-by: Chris Packham chris.packham@alliedtelesis.co.nz Acked-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/dts/fsl/p1010si-post.dtsi | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi index af12ead88c5f..404f570ebe23 100644 --- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi +++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi @@ -122,7 +122,15 @@ memory-controller@2000 { };
/include/ "pq3-i2c-0.dtsi" + i2c@3000 { + fsl,i2c-erratum-a004447; + }; + /include/ "pq3-i2c-1.dtsi" + i2c@3100 { + fsl,i2c-erratum-a004447; + }; + /include/ "pq3-duart-0.dtsi" /include/ "pq3-espi-0.dtsi" spi0: spi@7000 {
linux-stable-mirror@lists.linaro.org