From: Patrick Talbert ptalbert@redhat.com
[ Upstream commit 17c91487364fb33797ed84022564ee7544ac4945 ]
Now that ASPM is configured for *all* PCIe devices at boot, a problem is seen with systems that set the FADT NO_ASPM bit. This bit indicates that the OS should not alter the ASPM state, but when pcie_aspm_init_link_state() runs it only checks for !aspm_support_enabled. This misses the ACPI_FADT_NO_ASPM case because that is setting aspm_disabled.
The result is systems may hang at boot after 1302fcf; avoidable if they boot with pcie_aspm=off (sets !aspm_support_enabled).
Fix this by having aspm_init_link_state() check for either !aspm_support_enabled or acpm_disabled.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201001 Fixes: 1302fcf0d03e ("PCI: Configure *all* devices, not just hot-added ones") Signed-off-by: Patrick Talbert ptalbert@redhat.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/aspm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index c6a012b5ba390..6cc073f1d2d15 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -560,7 +560,7 @@ void pcie_aspm_init_link_state(struct pci_dev *pdev) struct pcie_link_state *link; int blacklist = !!pcie_aspm_sanity_check(pdev);
- if (!aspm_support_enabled) + if (!aspm_support_enabled || aspm_disabled) return;
if (pdev->link_state)
From: Zhang Rui rui.zhang@intel.com
[ Upstream commit aefa763b18a220f5fc1d5ab02af09158b6cc36ea ]
On Sony Vaio VPCEH3U1E, ACPI backlight control does not work, and native backlight works. Thus force use vendor backlight control on this system.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202401 Signed-off-by: Zhang Rui rui.zhang@intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/video_detect.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index 8c5503c0bad7c..3fc828e086b1e 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -135,6 +135,14 @@ static const struct dmi_system_id video_detect_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "UL30A"), }, }, + { + .callback = video_detect_force_vendor, + .ident = "Sony VPCEH3U1E", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Sony Corporation"), + DMI_MATCH(DMI_PRODUCT_NAME, "VPCEH3U1E"), + }, + },
/* * These models have a working acpi_video backlight control, and using
From: Venkata Narendra Kumar Gutta vnkgutta@codeaurora.org
[ Upstream commit edb16da34b084c66763f29bee42b4e6bb33c3d66 ]
Platform core is using pdev->name as the platform device name to do the binding of the devices with the drivers. But, when the platform driver overrides the platform device name with dev_set_name(), the pdev->name is pointing to a location which is freed and becomes an invalid parameter to do the binding match.
use-after-free instance:
[ 33.325013] BUG: KASAN: use-after-free in strcmp+0x8c/0xb0 [ 33.330646] Read of size 1 at addr ffffffc10beae600 by task modprobe [ 33.339068] CPU: 5 PID: 518 Comm: modprobe Tainted: G S W O 4.19.30+ #3 [ 33.346835] Hardware name: MTP (DT) [ 33.350419] Call trace: [ 33.352941] dump_backtrace+0x0/0x3b8 [ 33.356713] show_stack+0x24/0x30 [ 33.360119] dump_stack+0x160/0x1d8 [ 33.363709] print_address_description+0x84/0x2e0 [ 33.368549] kasan_report+0x26c/0x2d0 [ 33.372322] __asan_report_load1_noabort+0x2c/0x38 [ 33.377248] strcmp+0x8c/0xb0 [ 33.380306] platform_match+0x70/0x1f8 [ 33.384168] __driver_attach+0x78/0x3a0 [ 33.388111] bus_for_each_dev+0x13c/0x1b8 [ 33.392237] driver_attach+0x4c/0x58 [ 33.395910] bus_add_driver+0x350/0x560 [ 33.399854] driver_register+0x23c/0x328 [ 33.403886] __platform_driver_register+0xd0/0xe0
So, use dev_name(&pdev->dev), which fetches the platform device name from the kobject(dev->kobj->name) of the device instead of the pdev->name.
Signed-off-by: Venkata Narendra Kumar Gutta vnkgutta@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/platform.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c index 065fcc4be263a..071dd053a917b 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -796,7 +796,7 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *a, if (len != -ENODEV) return len;
- len = snprintf(buf, PAGE_SIZE, "platform:%s\n", pdev->name); + len = snprintf(buf, PAGE_SIZE, "platform:%s\n", dev_name(&pdev->dev));
return (len >= PAGE_SIZE) ? (PAGE_SIZE - 1) : len; } @@ -872,7 +872,7 @@ static int platform_uevent(struct device *dev, struct kobj_uevent_env *env) return rc;
add_uevent_var(env, "MODALIAS=%s%s", PLATFORM_MODULE_PREFIX, - pdev->name); + dev_name(&pdev->dev)); return 0; }
@@ -881,7 +881,7 @@ static const struct platform_device_id *platform_match_id( struct platform_device *pdev) { while (id->name[0]) { - if (strcmp(pdev->name, id->name) == 0) { + if (strcmp(dev_name(&pdev->dev), id->name) == 0) { pdev->id_entry = id; return id; } @@ -925,7 +925,7 @@ static int platform_match(struct device *dev, struct device_driver *drv) return platform_match_id(pdrv->id_table, pdev) != NULL;
/* fall-back to driver name match */ - return (strcmp(pdev->name, drv->name) == 0); + return (strcmp(dev_name(&pdev->dev), drv->name) == 0); }
#ifdef CONFIG_PM_SLEEP
From: Daniel Axtens dja@axtens.net
[ Upstream commit 934bda59f286d0221f1a3ebab7f5156a996cc37d ]
While developing KASAN for 64-bit book3s, I hit the following stack over-read.
It occurs because the hypercall to put characters onto the terminal takes 2 longs (128 bits/16 bytes) of characters at a time, and so hvc_put_chars() would unconditionally copy 16 bytes from the argument buffer, regardless of supplied length. However, udbg_hvc_putc() can call hvc_put_chars() with a single-byte buffer, leading to the error.
================================================================== BUG: KASAN: stack-out-of-bounds in hvc_put_chars+0xdc/0x110 Read of size 8 at addr c0000000023e7a90 by task swapper/0
CPU: 0 PID: 0 Comm: swapper Not tainted 5.2.0-rc2-next-20190528-02824-g048a6ab4835b #113 Call Trace: dump_stack+0x104/0x154 (unreliable) print_address_description+0xa0/0x30c __kasan_report+0x20c/0x224 kasan_report+0x18/0x30 __asan_report_load8_noabort+0x24/0x40 hvc_put_chars+0xdc/0x110 hvterm_raw_put_chars+0x9c/0x110 udbg_hvc_putc+0x154/0x200 udbg_write+0xf0/0x240 console_unlock+0x868/0xd30 register_console+0x970/0xe90 register_early_udbg_console+0xf8/0x114 setup_arch+0x108/0x790 start_kernel+0x104/0x784 start_here_common+0x1c/0x534
Memory state around the buggy address: c0000000023e7980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0000000023e7a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
c0000000023e7a80: f1 f1 01 f2 f2 f2 00 00 00 00 00 00 00 00 00 00
^ c0000000023e7b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0000000023e7b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ==================================================================
Document that a 16-byte buffer is requred, and provide it in udbg.
Signed-off-by: Daniel Axtens dja@axtens.net Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/pseries/hvconsole.c | 2 +- drivers/tty/hvc/hvc_vio.c | 16 +++++++++++++++- 2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/hvconsole.c b/arch/powerpc/platforms/pseries/hvconsole.c index 849b29b3e9ae0..954ef27128f22 100644 --- a/arch/powerpc/platforms/pseries/hvconsole.c +++ b/arch/powerpc/platforms/pseries/hvconsole.c @@ -62,7 +62,7 @@ EXPORT_SYMBOL(hvc_get_chars); * @vtermno: The vtermno or unit_address of the adapter from which the data * originated. * @buf: The character buffer that contains the character data to send to - * firmware. + * firmware. Must be at least 16 bytes, even if count is less than 16. * @count: Send this number of characters. */ int hvc_put_chars(uint32_t vtermno, const char *buf, int count) diff --git a/drivers/tty/hvc/hvc_vio.c b/drivers/tty/hvc/hvc_vio.c index f575a9b5ede7b..1d671d058dcb0 100644 --- a/drivers/tty/hvc/hvc_vio.c +++ b/drivers/tty/hvc/hvc_vio.c @@ -122,6 +122,14 @@ static int hvterm_raw_get_chars(uint32_t vtermno, char *buf, int count) return got; }
+/** + * hvterm_raw_put_chars: send characters to firmware for given vterm adapter + * @vtermno: The virtual terminal number. + * @buf: The characters to send. Because of the underlying hypercall in + * hvc_put_chars(), this buffer must be at least 16 bytes long, even if + * you are sending fewer chars. + * @count: number of chars to send. + */ static int hvterm_raw_put_chars(uint32_t vtermno, const char *buf, int count) { struct hvterm_priv *pv = hvterm_privs[vtermno]; @@ -234,6 +242,7 @@ static const struct hv_ops hvterm_hvsi_ops = { static void udbg_hvc_putc(char c) { int count = -1; + unsigned char bounce_buffer[16];
if (!hvterm_privs[0]) return; @@ -244,7 +253,12 @@ static void udbg_hvc_putc(char c) do { switch(hvterm_privs[0]->proto) { case HV_PROTOCOL_RAW: - count = hvterm_raw_put_chars(0, &c, 1); + /* + * hvterm_raw_put_chars requires at least a 16-byte + * buffer, so go via the bounce buffer + */ + bounce_buffer[0] = c; + count = hvterm_raw_put_chars(0, bounce_buffer, 1); break; case HV_PROTOCOL_HVSI: count = hvterm_hvsi_put_chars(0, &c, 1);
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit fd56141244066a6a21ef458670071c58b6402035 ]
The previous patch guarantees that srp_queuecommand() does not get invoked while reconnecting occurs. Hence remove the code from srp_queuecommand() that prevents command queueing while reconnecting. This patch avoids that the following can appear in the kernel log:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747 in_atomic(): 1, irqs_disabled(): 0, pid: 5600, name: scsi_eh_9 1 lock held by scsi_eh_9/5600: #0: (rcu_read_lock){....}, at: [<00000000cbb798c7>] __blk_mq_run_hw_queue+0xf1/0x1e0 Preemption disabled at: [<00000000139badf2>] __blk_mq_delay_run_hw_queue+0x78/0xf0 CPU: 9 PID: 5600 Comm: scsi_eh_9 Tainted: G W 4.15.0-rc4-dbg+ #1 Hardware name: Dell Inc. PowerEdge R720/0VWT90, BIOS 2.5.4 01/22/2016 Call Trace: dump_stack+0x67/0x99 ___might_sleep+0x16a/0x250 [ib_srp] __mutex_lock+0x46/0x9d0 srp_queuecommand+0x356/0x420 [ib_srp] scsi_dispatch_cmd+0xf6/0x3f0 scsi_queue_rq+0x4a8/0x5f0 blk_mq_dispatch_rq_list+0x73/0x440 blk_mq_sched_dispatch_requests+0x109/0x1a0 __blk_mq_run_hw_queue+0x131/0x1e0 __blk_mq_delay_run_hw_queue+0x9a/0xf0 blk_mq_run_hw_queue+0xc0/0x1e0 blk_mq_start_hw_queues+0x2c/0x40 scsi_run_queue+0x18e/0x2d0 scsi_run_host_queues+0x22/0x40 scsi_error_handler+0x18d/0x5f0 kthread+0x11c/0x140 ret_from_fork+0x24/0x30
Reviewed-by: Hannes Reinecke hare@suse.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Laurence Oberman loberman@redhat.com Cc: Jason Gunthorpe jgg@mellanox.com Cc: Leon Romanovsky leonro@mellanox.com Cc: Doug Ledford dledford@redhat.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/ulp/srp/ib_srp.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 3dbc3ed263c21..9a70267244e1a 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2057,7 +2057,6 @@ static void srp_send_completion(struct ib_cq *cq, void *ch_ptr) static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) { struct srp_target_port *target = host_to_target(shost); - struct srp_rport *rport = target->rport; struct srp_rdma_ch *ch; struct srp_request *req; struct srp_iu *iu; @@ -2067,16 +2066,6 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) u32 tag; u16 idx; int len, ret; - const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler; - - /* - * The SCSI EH thread is the only context from which srp_queuecommand() - * can get invoked for blocked devices (SDEV_BLOCK / - * SDEV_CREATED_BLOCK). Avoid racing with srp_reconnect_rport() by - * locking the rport mutex if invoked from inside the SCSI EH. - */ - if (in_scsi_eh) - mutex_lock(&rport->mutex);
scmnd->result = srp_chkready(target->rport); if (unlikely(scmnd->result)) @@ -2138,13 +2127,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) goto err_unmap; }
- ret = 0; - -unlock_rport: - if (in_scsi_eh) - mutex_unlock(&rport->mutex); - - return ret; + return 0;
err_unmap: srp_unmap_data(scmnd, ch, req); @@ -2166,7 +2149,7 @@ static int srp_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd) ret = SCSI_MLQUEUE_HOST_BUSY; }
- goto unlock_rport; + return ret; }
/*
From: Eric Dumazet edumazet@google.com
[ Upstream commit 159d2c7d8106177bd9a986fd005a311fe0d11285 ]
qdisc_root() use from netem_enqueue() triggers a lockdep warning.
__dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned.
WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 #1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838
stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sch_generic.h | 5 +++++ net/sched/sch_netem.c | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 7a5d6a0731654..ccd2a964dad7d 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -289,6 +289,11 @@ static inline struct Qdisc *qdisc_root(const struct Qdisc *qdisc) return q; }
+static inline struct Qdisc *qdisc_root_bh(const struct Qdisc *qdisc) +{ + return rcu_dereference_bh(qdisc->dev_queue->qdisc); +} + static inline struct Qdisc *qdisc_root_sleeping(const struct Qdisc *qdisc) { return qdisc->dev_queue->qdisc_sleeping; diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index 2a431628af591..caf33af4f9a7d 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -464,7 +464,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) * skb will be queued. */ if (count > 1 && (skb2 = skb_clone(skb, GFP_ATOMIC)) != NULL) { - struct Qdisc *rootq = qdisc_root(sch); + struct Qdisc *rootq = qdisc_root_bh(sch); u32 dupsave = q->duplicate; /* prevent duplicating a dup... */
q->duplicate = 0;
From: Eric Biggers ebiggers@google.com
[ Upstream commit c6ee11c39fcc1fb55130748990a8f199e76263b4 ]
syzbot reported:
BUG: memory leak unreferenced object 0xffff888116270800 (size 224): comm "syz-executor641", pid 7047, jiffies 4294947360 (age 13.860s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 20 e1 2a 81 88 ff ff 00 40 3d 2a 81 88 ff ff . .*.....@=*.... backtrace: [<000000004d41b4cc>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000004d41b4cc>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000004d41b4cc>] slab_alloc_node mm/slab.c:3269 [inline] [<000000004d41b4cc>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000506a5965>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<000000001ba5a161>] alloc_skb include/linux/skbuff.h:1058 [inline] [<000000001ba5a161>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327 [<0000000047d9c78b>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225 [<000000003828fe54>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242 [<00000000e34d94f9>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933 [<00000000de2de3fb>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000de2de3fb>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<000000008fe16e7a>] __sys_sendto+0x148/0x1f0 net/socket.c:1964 [...]
The bug is that llc_sap_state_process() always takes an extra reference to the skb, but sometimes neither llc_sap_next_state() nor llc_sap_state_process() itself drops this reference.
Fix it by changing llc_sap_next_state() to never consume a reference to the skb, rather than sometimes do so and sometimes not. Then remove the extra skb_get() and kfree_skb() from llc_sap_state_process().
Reported-by: syzbot+6bf095f9becf5efef645@syzkaller.appspotmail.com Reported-by: syzbot+31c16aa4202dace3812e@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/llc/llc_s_ac.c | 12 +++++++++--- net/llc/llc_sap.c | 23 ++++++++--------------- 2 files changed, 17 insertions(+), 18 deletions(-)
diff --git a/net/llc/llc_s_ac.c b/net/llc/llc_s_ac.c index a94bd56bcac6f..7ae4cc684d3ab 100644 --- a/net/llc/llc_s_ac.c +++ b/net/llc/llc_s_ac.c @@ -58,8 +58,10 @@ int llc_sap_action_send_ui(struct llc_sap *sap, struct sk_buff *skb) ev->daddr.lsap, LLC_PDU_CMD); llc_pdu_init_as_ui_cmd(skb); rc = llc_mac_hdr_init(skb, ev->saddr.mac, ev->daddr.mac); - if (likely(!rc)) + if (likely(!rc)) { + skb_get(skb); rc = dev_queue_xmit(skb); + } return rc; }
@@ -81,8 +83,10 @@ int llc_sap_action_send_xid_c(struct llc_sap *sap, struct sk_buff *skb) ev->daddr.lsap, LLC_PDU_CMD); llc_pdu_init_as_xid_cmd(skb, LLC_XID_NULL_CLASS_2, 0); rc = llc_mac_hdr_init(skb, ev->saddr.mac, ev->daddr.mac); - if (likely(!rc)) + if (likely(!rc)) { + skb_get(skb); rc = dev_queue_xmit(skb); + } return rc; }
@@ -135,8 +139,10 @@ int llc_sap_action_send_test_c(struct llc_sap *sap, struct sk_buff *skb) ev->daddr.lsap, LLC_PDU_CMD); llc_pdu_init_as_test_cmd(skb); rc = llc_mac_hdr_init(skb, ev->saddr.mac, ev->daddr.mac); - if (likely(!rc)) + if (likely(!rc)) { + skb_get(skb); rc = dev_queue_xmit(skb); + } return rc; }
diff --git a/net/llc/llc_sap.c b/net/llc/llc_sap.c index 5404d0d195cc5..d51ff9df9c959 100644 --- a/net/llc/llc_sap.c +++ b/net/llc/llc_sap.c @@ -197,29 +197,22 @@ static int llc_sap_next_state(struct llc_sap *sap, struct sk_buff *skb) * After executing actions of the event, upper layer will be indicated * if needed(on receiving an UI frame). sk can be null for the * datalink_proto case. + * + * This function always consumes a reference to the skb. */ static void llc_sap_state_process(struct llc_sap *sap, struct sk_buff *skb) { struct llc_sap_state_ev *ev = llc_sap_ev(skb);
- /* - * We have to hold the skb, because llc_sap_next_state - * will kfree it in the sending path and we need to - * look at the skb->cb, where we encode llc_sap_state_ev. - */ - skb_get(skb); ev->ind_cfm_flag = 0; llc_sap_next_state(sap, skb); - if (ev->ind_cfm_flag == LLC_IND) { - if (skb->sk->sk_state == TCP_LISTEN) - kfree_skb(skb); - else { - llc_save_primitive(skb->sk, skb, ev->prim);
- /* queue skb to the user. */ - if (sock_queue_rcv_skb(skb->sk, skb)) - kfree_skb(skb); - } + if (ev->ind_cfm_flag == LLC_IND && skb->sk->sk_state != TCP_LISTEN) { + llc_save_primitive(skb->sk, skb, ev->prim); + + /* queue skb to the user. */ + if (sock_queue_rcv_skb(skb->sk, skb) == 0) + return; } kfree_skb(skb); }
From: Eric Biggers ebiggers@google.com
[ Upstream commit b74555de21acd791f12c4a1aeaf653dd7ac21133 ]
syzbot reported:
BUG: memory leak unreferenced object 0xffff88811eb3de00 (size 224): comm "syz-executor559", pid 7315, jiffies 4294943019 (age 10.300s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 a0 38 24 81 88 ff ff 00 c0 f2 15 81 88 ff ff ..8$............ backtrace: [<000000008d1c66a1>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000008d1c66a1>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000008d1c66a1>] slab_alloc_node mm/slab.c:3269 [inline] [<000000008d1c66a1>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000447d9496>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<000000000cdbf82f>] alloc_skb include/linux/skbuff.h:1058 [inline] [<000000000cdbf82f>] llc_alloc_frame+0x66/0x110 net/llc/llc_sap.c:54 [<000000002418b52e>] llc_conn_ac_send_sabme_cmd_p_set_x+0x2f/0x140 net/llc/llc_c_ac.c:777 [<000000001372ae17>] llc_exec_conn_trans_actions net/llc/llc_conn.c:475 [inline] [<000000001372ae17>] llc_conn_service net/llc/llc_conn.c:400 [inline] [<000000001372ae17>] llc_conn_state_process+0x1ac/0x640 net/llc/llc_conn.c:75 [<00000000f27e53c1>] llc_establish_connection+0x110/0x170 net/llc/llc_if.c:109 [<00000000291b2ca0>] llc_ui_connect+0x10e/0x370 net/llc/af_llc.c:477 [<000000000f9c740b>] __sys_connect+0x11d/0x170 net/socket.c:1840 [...]
The bug is that most callers of llc_conn_send_pdu() assume it consumes a reference to the skb, when actually due to commit b85ab56c3f81 ("llc: properly handle dev_queue_xmit() return value") it doesn't.
Revert most of that commit, and instead make the few places that need llc_conn_send_pdu() to *not* consume a reference call skb_get() before.
Fixes: b85ab56c3f81 ("llc: properly handle dev_queue_xmit() return value") Reported-by: syzbot+6b825a6494a04cc0e3f7@syzkaller.appspotmail.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/llc_conn.h | 2 +- net/llc/llc_c_ac.c | 8 ++++++-- net/llc/llc_conn.c | 32 +++++++++----------------------- 3 files changed, 16 insertions(+), 26 deletions(-)
diff --git a/include/net/llc_conn.h b/include/net/llc_conn.h index df528a6235487..ea985aa7a6c5e 100644 --- a/include/net/llc_conn.h +++ b/include/net/llc_conn.h @@ -104,7 +104,7 @@ void llc_sk_reset(struct sock *sk);
/* Access to a connection */ int llc_conn_state_process(struct sock *sk, struct sk_buff *skb); -int llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb); +void llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb); void llc_conn_rtn_pdu(struct sock *sk, struct sk_buff *skb); void llc_conn_resend_i_pdu_as_cmd(struct sock *sk, u8 nr, u8 first_p_bit); void llc_conn_resend_i_pdu_as_rsp(struct sock *sk, u8 nr, u8 first_f_bit); diff --git a/net/llc/llc_c_ac.c b/net/llc/llc_c_ac.c index 4b60f68cb4925..8354ae40ec85b 100644 --- a/net/llc/llc_c_ac.c +++ b/net/llc/llc_c_ac.c @@ -372,6 +372,7 @@ int llc_conn_ac_send_i_cmd_p_set_1(struct sock *sk, struct sk_buff *skb) llc_pdu_init_as_i_cmd(skb, 1, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { + skb_get(skb); llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } @@ -389,7 +390,8 @@ static int llc_conn_ac_send_i_cmd_p_set_0(struct sock *sk, struct sk_buff *skb) llc_pdu_init_as_i_cmd(skb, 0, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { - rc = llc_conn_send_pdu(sk, skb); + skb_get(skb); + llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } return rc; @@ -406,6 +408,7 @@ int llc_conn_ac_send_i_xxx_x_set_0(struct sock *sk, struct sk_buff *skb) llc_pdu_init_as_i_cmd(skb, 0, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { + skb_get(skb); llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } @@ -916,7 +919,8 @@ static int llc_conn_ac_send_i_rsp_f_set_ackpf(struct sock *sk, llc_pdu_init_as_i_cmd(skb, llc->ack_pf, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { - rc = llc_conn_send_pdu(sk, skb); + skb_get(skb); + llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } return rc; diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c index 79c346fd859b2..d861b74ad068c 100644 --- a/net/llc/llc_conn.c +++ b/net/llc/llc_conn.c @@ -30,7 +30,7 @@ #endif
static int llc_find_offset(int state, int ev_type); -static int llc_conn_send_pdus(struct sock *sk, struct sk_buff *skb); +static void llc_conn_send_pdus(struct sock *sk); static int llc_conn_service(struct sock *sk, struct sk_buff *skb); static int llc_exec_conn_trans_actions(struct sock *sk, struct llc_conn_state_trans *trans, @@ -193,11 +193,11 @@ int llc_conn_state_process(struct sock *sk, struct sk_buff *skb) return rc; }
-int llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb) +void llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb) { /* queue PDU to send to MAC layer */ skb_queue_tail(&sk->sk_write_queue, skb); - return llc_conn_send_pdus(sk, skb); + llc_conn_send_pdus(sk); }
/** @@ -255,7 +255,7 @@ void llc_conn_resend_i_pdu_as_cmd(struct sock *sk, u8 nr, u8 first_p_bit) if (howmany_resend > 0) llc->vS = (llc->vS + 1) % LLC_2_SEQ_NBR_MODULO; /* any PDUs to re-send are queued up; start sending to MAC */ - llc_conn_send_pdus(sk, NULL); + llc_conn_send_pdus(sk); out:; }
@@ -296,7 +296,7 @@ void llc_conn_resend_i_pdu_as_rsp(struct sock *sk, u8 nr, u8 first_f_bit) if (howmany_resend > 0) llc->vS = (llc->vS + 1) % LLC_2_SEQ_NBR_MODULO; /* any PDUs to re-send are queued up; start sending to MAC */ - llc_conn_send_pdus(sk, NULL); + llc_conn_send_pdus(sk); out:; }
@@ -340,16 +340,12 @@ int llc_conn_remove_acked_pdus(struct sock *sk, u8 nr, u16 *how_many_unacked) /** * llc_conn_send_pdus - Sends queued PDUs * @sk: active connection - * @hold_skb: the skb held by caller, or NULL if does not care * - * Sends queued pdus to MAC layer for transmission. When @hold_skb is - * NULL, always return 0. Otherwise, return 0 if @hold_skb is sent - * successfully, or 1 for failure. + * Sends queued pdus to MAC layer for transmission. */ -static int llc_conn_send_pdus(struct sock *sk, struct sk_buff *hold_skb) +static void llc_conn_send_pdus(struct sock *sk) { struct sk_buff *skb; - int ret = 0;
while ((skb = skb_dequeue(&sk->sk_write_queue)) != NULL) { struct llc_pdu_sn *pdu = llc_pdu_sn_hdr(skb); @@ -361,20 +357,10 @@ static int llc_conn_send_pdus(struct sock *sk, struct sk_buff *hold_skb) skb_queue_tail(&llc_sk(sk)->pdu_unack_q, skb); if (!skb2) break; - dev_queue_xmit(skb2); - } else { - bool is_target = skb == hold_skb; - int rc; - - if (is_target) - skb_get(skb); - rc = dev_queue_xmit(skb); - if (is_target) - ret = rc; + skb = skb2; } + dev_queue_xmit(skb); } - - return ret; }
/**
From: Eric Dumazet edumazet@google.com
[ Upstream commit a7137534b597b7c303203e6bc3ed87e87a273bb8 ]
syzbot got a NULL dereference in bond_update_slave_arr() [1], happening after a failure to allocate bond->slave_arr
A workqueue (bond_slave_arr_handler) is supposed to retry the allocation later, but if the slave is removed before the workqueue had a chance to complete, bond->slave_arr can still be NULL.
[1]
Failed to build slave-array. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI Modules linked in: Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:bond_update_slave_arr.cold+0xc6/0x198 drivers/net/bonding/bond_main.c:4039 RSP: 0018:ffff88018fe33678 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc9000290b000 RDX: 0000000000000000 RSI: ffffffff82b63037 RDI: ffff88019745ea20 RBP: ffff88018fe33760 R08: ffff880170754280 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff88019745ea00 R14: 0000000000000000 R15: ffff88018fe338b0 FS: 00007febd837d700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004540a0 CR3: 00000001c242e005 CR4: 00000000001626f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffff82b5b45e>] __bond_release_one+0x43e/0x500 drivers/net/bonding/bond_main.c:1923 [<ffffffff82b5b966>] bond_release drivers/net/bonding/bond_main.c:2039 [inline] [<ffffffff82b5b966>] bond_do_ioctl+0x416/0x870 drivers/net/bonding/bond_main.c:3562 [<ffffffff83ae25f4>] dev_ifsioc+0x6f4/0x940 net/core/dev_ioctl.c:328 [<ffffffff83ae2e58>] dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:495 [<ffffffff83995ffd>] sock_do_ioctl+0x1bd/0x300 net/socket.c:1088 [<ffffffff83996a80>] sock_ioctl+0x300/0x5d0 net/socket.c:1196 [<ffffffff81b124db>] vfs_ioctl fs/ioctl.c:47 [inline] [<ffffffff81b124db>] file_ioctl fs/ioctl.c:501 [inline] [<ffffffff81b124db>] do_vfs_ioctl+0xacb/0x1300 fs/ioctl.c:688 [<ffffffff81b12dc6>] SYSC_ioctl fs/ioctl.c:705 [inline] [<ffffffff81b12dc6>] SyS_ioctl+0xb6/0xe0 fs/ioctl.c:696 [<ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305 [<ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Mahesh Bandewar maheshb@google.com Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index fd6aff9f0052e..1bf4f54c2befb 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3889,7 +3889,7 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave) * this to-be-skipped slave to send a packet out. */ old_arr = rtnl_dereference(bond->slave_arr); - for (idx = 0; idx < old_arr->count; idx++) { + for (idx = 0; old_arr != NULL && idx < old_arr->count; idx++) { if (skipslave == old_arr->arr[idx]) { old_arr->arr[idx] = old_arr->arr[old_arr->count-1];
From: Valentin Vidic vvidic@valentin-vidic.from.hr
[ Upstream commit 77b6d09f4ae66d42cd63b121af67780ae3d1a5e9 ]
Make sure res does not contain random value if the call to sr_read_cmd fails for some reason.
Reported-by: syzbot+f1842130bbcfb335bac1@syzkaller.appspotmail.com Signed-off-by: Valentin Vidic vvidic@valentin-vidic.from.hr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/sr9800.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/usb/sr9800.c b/drivers/net/usb/sr9800.c index 004c955c1fd1b..da0ae16f5c74c 100644 --- a/drivers/net/usb/sr9800.c +++ b/drivers/net/usb/sr9800.c @@ -336,7 +336,7 @@ static void sr_set_multicast(struct net_device *net) static int sr_mdio_read(struct net_device *net, int phy_id, int loc) { struct usbnet *dev = netdev_priv(net); - __le16 res; + __le16 res = 0;
mutex_lock(&dev->phy_mutex); sr_set_sw_mii(dev);
From: Chandan Rajendra chandan@linux.ibm.com
[ Upstream commit 547b9ad698b434eadca46319cb47e5875b55ef03 ]
When executing generic/388 on a ppc64le machine, we notice the following call trace,
VFS: brelse: Trying to free free buffer WARNING: CPU: 0 PID: 6637 at /root/repos/linux/fs/buffer.c:1195 __brelse+0x84/0xc0
Call Trace: __brelse+0x80/0xc0 (unreliable) invalidate_bh_lru+0x78/0xc0 on_each_cpu_mask+0xa8/0x130 on_each_cpu_cond_mask+0x130/0x170 invalidate_bh_lrus+0x44/0x60 invalidate_bdev+0x38/0x70 ext4_put_super+0x294/0x560 generic_shutdown_super+0xb0/0x170 kill_block_super+0x38/0xb0 deactivate_locked_super+0xa4/0xf0 cleanup_mnt+0x164/0x1d0 task_work_run+0x110/0x160 do_notify_resume+0x414/0x460 ret_from_except_lite+0x70/0x74
The warning happens because flush_descriptor() drops bh reference it does not own. The bh reference acquired by jbd2_journal_get_descriptor_buffer() is owned by the log_bufs list and gets released when this list is processed. The reference for doing IO is only acquired in write_dirty_buffer() later in flush_descriptor().
Reported-by: Harish Sriram harish@linux.ibm.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Chandan Rajendra chandan@linux.ibm.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/revoke.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c index 705ae577882b1..8ad752c4bee31 100644 --- a/fs/jbd2/revoke.c +++ b/fs/jbd2/revoke.c @@ -658,10 +658,8 @@ static void flush_descriptor(journal_t *journal, { jbd2_journal_revoke_header_t *header;
- if (is_journal_aborted(journal)) { - put_bh(descriptor); + if (is_journal_aborted(journal)) return; - }
header = (jbd2_journal_revoke_header_t *)descriptor->b_data; header->r_count = cpu_to_be32(offset);
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit e1aa1a1db3b01c9890e82cf065cee99962ba1ed9 ]
Fix following lockdep warning disabling bh in ath_dynack_node_init/ath_dynack_node_deinit
[ 75.955878] -------------------------------- [ 75.955880] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 75.955884] swapper/0/0 [HC0[0]:SC1[3]:HE1:SE0] takes: [ 75.955888] 00000000792a7ee0 (&(&da->qlock)->rlock){+.?.}, at: ath_dynack_sample_ack_ts+0x4d/0xa0 [ath9k_hw] [ 75.955905] {SOFTIRQ-ON-W} state was registered at: [ 75.955912] lock_acquire+0x9a/0x160 [ 75.955917] _raw_spin_lock+0x2c/0x70 [ 75.955927] ath_dynack_node_init+0x2a/0x60 [ath9k_hw] [ 75.955934] ath9k_sta_state+0xec/0x160 [ath9k] [ 75.955976] drv_sta_state+0xb2/0x740 [mac80211] [ 75.956008] sta_info_insert_finish+0x21a/0x420 [mac80211] [ 75.956039] sta_info_insert_rcu+0x12b/0x2c0 [mac80211] [ 75.956069] sta_info_insert+0x7/0x70 [mac80211] [ 75.956093] ieee80211_prep_connection+0x42e/0x730 [mac80211] [ 75.956120] ieee80211_mgd_auth.cold+0xb9/0x15c [mac80211] [ 75.956152] cfg80211_mlme_auth+0x143/0x350 [cfg80211] [ 75.956169] nl80211_authenticate+0x25e/0x2b0 [cfg80211] [ 75.956172] genl_family_rcv_msg+0x198/0x400 [ 75.956174] genl_rcv_msg+0x42/0x90 [ 75.956176] netlink_rcv_skb+0x35/0xf0 [ 75.956178] genl_rcv+0x1f/0x30 [ 75.956180] netlink_unicast+0x154/0x200 [ 75.956182] netlink_sendmsg+0x1bf/0x3d0 [ 75.956186] ___sys_sendmsg+0x2c2/0x2f0 [ 75.956187] __sys_sendmsg+0x44/0x80 [ 75.956190] do_syscall_64+0x55/0x1a0 [ 75.956192] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 75.956194] irq event stamp: 2357092 [ 75.956196] hardirqs last enabled at (2357092): [<ffffffff818c62de>] _raw_spin_unlock_irqrestore+0x3e/0x50 [ 75.956199] hardirqs last disabled at (2357091): [<ffffffff818c60b1>] _raw_spin_lock_irqsave+0x11/0x80 [ 75.956202] softirqs last enabled at (2357072): [<ffffffff8106dc09>] irq_enter+0x59/0x60 [ 75.956204] softirqs last disabled at (2357073): [<ffffffff8106dcbe>] irq_exit+0xae/0xc0 [ 75.956206] other info that might help us debug this: [ 75.956207] Possible unsafe locking scenario:
[ 75.956208] CPU0 [ 75.956209] ---- [ 75.956210] lock(&(&da->qlock)->rlock); [ 75.956213] <Interrupt> [ 75.956214] lock(&(&da->qlock)->rlock); [ 75.956216] *** DEADLOCK ***
[ 75.956217] 1 lock held by swapper/0/0: [ 75.956219] #0: 000000003bb5675c (&(&sc->sc_pcu_lock)->rlock){+.-.}, at: ath9k_tasklet+0x55/0x240 [ath9k] [ 75.956225] stack backtrace: [ 75.956228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc1-wdn+ #13 [ 75.956229] Hardware name: Dell Inc. Studio XPS 1340/0K183D, BIOS A11 09/08/2009 [ 75.956231] Call Trace: [ 75.956233] <IRQ> [ 75.956236] dump_stack+0x67/0x90 [ 75.956239] mark_lock+0x4c1/0x640 [ 75.956242] ? check_usage_backwards+0x130/0x130 [ 75.956245] ? sched_clock_local+0x12/0x80 [ 75.956247] __lock_acquire+0x484/0x7a0 [ 75.956250] ? __lock_acquire+0x3b9/0x7a0 [ 75.956252] lock_acquire+0x9a/0x160 [ 75.956259] ? ath_dynack_sample_ack_ts+0x4d/0xa0 [ath9k_hw] [ 75.956262] _raw_spin_lock_bh+0x34/0x80 [ 75.956268] ? ath_dynack_sample_ack_ts+0x4d/0xa0 [ath9k_hw] [ 75.956275] ath_dynack_sample_ack_ts+0x4d/0xa0 [ath9k_hw] [ 75.956280] ath_rx_tasklet+0xd09/0xe90 [ath9k] [ 75.956286] ath9k_tasklet+0x102/0x240 [ath9k] [ 75.956288] tasklet_action_common.isra.0+0x6d/0x170 [ 75.956291] __do_softirq+0xcc/0x425 [ 75.956294] irq_exit+0xae/0xc0 [ 75.956296] do_IRQ+0x8a/0x110 [ 75.956298] common_interrupt+0xf/0xf [ 75.956300] </IRQ> [ 75.956303] RIP: 0010:cpuidle_enter_state+0xb2/0x400 [ 75.956308] RSP: 0018:ffffffff82203e70 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffd7 [ 75.956310] RAX: ffffffff82219800 RBX: ffffffff822bd0a0 RCX: 0000000000000000 [ 75.956312] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffffff82219800 [ 75.956314] RBP: ffff888155a01c00 R08: 00000011af51aabe R09: 0000000000000000 [ 75.956315] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 [ 75.956317] R13: 00000011af51aabe R14: 0000000000000003 R15: ffffffff82219800 [ 75.956321] cpuidle_enter+0x24/0x40 [ 75.956323] do_idle+0x1ac/0x220 [ 75.956326] cpu_startup_entry+0x14/0x20 [ 75.956329] start_kernel+0x482/0x489 [ 75.956332] secondary_startup_64+0xa4/0xb0
Fixes: c774d57fd47c ("ath9k: add dynamic ACK timeout estimation") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Tested-by: Koen Vandeputte koen.vandeputte@ncentric.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/dynack.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/dynack.c b/drivers/net/wireless/ath/ath9k/dynack.c index 22b3cc4c27cda..58205a5bd74b5 100644 --- a/drivers/net/wireless/ath/ath9k/dynack.c +++ b/drivers/net/wireless/ath/ath9k/dynack.c @@ -285,9 +285,9 @@ void ath_dynack_node_init(struct ath_hw *ah, struct ath_node *an)
an->ackto = ackto;
- spin_lock(&da->qlock); + spin_lock_bh(&da->qlock); list_add_tail(&an->list, &da->nodes); - spin_unlock(&da->qlock); + spin_unlock_bh(&da->qlock); } EXPORT_SYMBOL(ath_dynack_node_init);
@@ -301,9 +301,9 @@ void ath_dynack_node_deinit(struct ath_hw *ah, struct ath_node *an) { struct ath_dynack *da = &ah->dynack;
- spin_lock(&da->qlock); + spin_lock_bh(&da->qlock); list_del(&an->list); - spin_unlock(&da->qlock); + spin_unlock_bh(&da->qlock); } EXPORT_SYMBOL(ath_dynack_node_deinit);
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 7764d56baa844d7f6206394f21a0e8c1f303c476 ]
If we are able to load an existing inode cache off disk, we set the state of the cache to BTRFS_CACHE_FINISHED, but we don't wake up any one waiting for the cache to be available. This means that anyone waiting for the cache to be available, waiting on the condition that either its state is BTRFS_CACHE_FINISHED or its available free space is greather than zero, can hang forever.
This could be observed running fstests with MOUNT_OPTIONS="-o inode_cache", in particular test case generic/161 triggered it very frequently for me, producing a trace like the following:
[63795.739712] BTRFS info (device sdc): enabling inode map caching [63795.739714] BTRFS info (device sdc): disk space caching is enabled [63795.739716] BTRFS info (device sdc): has skinny extents [64036.653886] INFO: task btrfs-transacti:3917 blocked for more than 120 seconds. [64036.654079] Not tainted 5.2.0-rc4-btrfs-next-50 #1 [64036.654143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [64036.654232] btrfs-transacti D 0 3917 2 0x80004000 [64036.654239] Call Trace: [64036.654258] ? __schedule+0x3ae/0x7b0 [64036.654271] schedule+0x3a/0xb0 [64036.654325] btrfs_commit_transaction+0x978/0xae0 [btrfs] [64036.654339] ? remove_wait_queue+0x60/0x60 [64036.654395] transaction_kthread+0x146/0x180 [btrfs] [64036.654450] ? btrfs_cleanup_transaction+0x620/0x620 [btrfs] [64036.654456] kthread+0x103/0x140 [64036.654464] ? kthread_create_worker_on_cpu+0x70/0x70 [64036.654476] ret_from_fork+0x3a/0x50 [64036.654504] INFO: task xfs_io:3919 blocked for more than 120 seconds. [64036.654568] Not tainted 5.2.0-rc4-btrfs-next-50 #1 [64036.654617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [64036.654685] xfs_io D 0 3919 3633 0x00000000 [64036.654691] Call Trace: [64036.654703] ? __schedule+0x3ae/0x7b0 [64036.654716] schedule+0x3a/0xb0 [64036.654756] btrfs_find_free_ino+0xa9/0x120 [btrfs] [64036.654764] ? remove_wait_queue+0x60/0x60 [64036.654809] btrfs_create+0x72/0x1f0 [btrfs] [64036.654822] lookup_open+0x6bc/0x790 [64036.654849] path_openat+0x3bc/0xc00 [64036.654854] ? __lock_acquire+0x331/0x1cb0 [64036.654869] do_filp_open+0x99/0x110 [64036.654884] ? __alloc_fd+0xee/0x200 [64036.654895] ? do_raw_spin_unlock+0x49/0xc0 [64036.654909] ? do_sys_open+0x132/0x220 [64036.654913] do_sys_open+0x132/0x220 [64036.654926] do_syscall_64+0x60/0x1d0 [64036.654933] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fix this by adding a wake_up() call right after setting the cache state to BTRFS_CACHE_FINISHED, at start_caching(), when we are able to load the cache from disk.
Fixes: 82d5902d9c681b ("Btrfs: Support reading/writing on disk free ino cache") Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/inode-map.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c index 07573dc1614ab..3469c7ce7cb6d 100644 --- a/fs/btrfs/inode-map.c +++ b/fs/btrfs/inode-map.c @@ -158,6 +158,7 @@ static void start_caching(struct btrfs_root *root) spin_lock(&root->ino_cache_lock); root->ino_cache_state = BTRFS_CACHE_FINISHED; spin_unlock(&root->ino_cache_lock); + wake_up(&root->ino_cache_wait); return; }
From: Zhihao Cheng chengzhihao1@huawei.com
[ Upstream commit 8615b94f029a4fb4306d3512aaf1c45f5fc24d4b ]
Running stress test io_paral (A pressure ubi test in mtd-utils) on an UBI device with fewer PEBs (fastmap enabled) may cause ENOSPC errors and make UBI device read-only, but there are still free PEBs on the UBI device. This problem can be easily reproduced by performing the following steps on a 2-core machine: $ modprobe nandsim first_id_byte=0x20 second_id_byte=0x33 parts=80 $ modprobe ubi mtd="0,0" fm_autoconvert $ ./io_paral /dev/ubi0
We may see the following verbose: (output) [io_paral] update_volume():108: failed to write 380 bytes at offset 95920 of volume 2 [io_paral] update_volume():109: update: 97088 bytes [io_paral] write_thread():227: function pwrite() failed with error 28 (No space left on device) [io_paral] write_thread():229: cannot write 15872 bytes to offs 31744, wrote -1 (dmesg) ubi0 error: ubi_wl_get_peb [ubi]: Unable to get a free PEB from user WL pool ubi0 warning: ubi_eba_write_leb [ubi]: switch to read-only mode CPU: 0 PID: 2027 Comm: io_paral Not tainted 5.3.0-rc2-00001-g5986cd0 #9 ubi0 warning: try_write_vid_and_data [ubi]: failed to write VID header to LEB 2:5, PEB 18 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x85/0xba ubi_eba_write_leb+0xa1e/0xa40 [ubi] vol_cdev_write+0x307/0x520 [ubi] vfs_write+0xfa/0x280 ksys_pwrite64+0xc5/0xe0 __x64_sys_pwrite64+0x22/0x30 do_syscall_64+0xbf/0x440
In function ubi_wl_get_peb, the operation of filling the pool (ubi_update_fastmap) with free PEBs and fetching a free PEB from the pool is not atomic. After thread A filling the pool with free PEB, free PEB may be taken away by thread B. When thread A checks the expression again, the condition is still unsatisfactory. At this time, there may still be free PEBs on UBI that can be filled into the pool.
This patch increases the number of attempts to obtain PEB. An extreme case (No free PEBs left after creating test volumes) has been tested on different type of machines for 100 times. The biggest number of attempts are shown below:
x86_64 arm64 2-core 4 4 4-core 8 4 8-core 4 4
Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/ubi/fastmap-wl.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c index 69dd21679a30c..99aa30b28e46e 100644 --- a/drivers/mtd/ubi/fastmap-wl.c +++ b/drivers/mtd/ubi/fastmap-wl.c @@ -205,7 +205,7 @@ static int produce_free_peb(struct ubi_device *ubi) */ int ubi_wl_get_peb(struct ubi_device *ubi) { - int ret, retried = 0; + int ret, attempts = 0; struct ubi_fm_pool *pool = &ubi->fm_pool; struct ubi_fm_pool *wl_pool = &ubi->fm_wl_pool;
@@ -230,12 +230,12 @@ int ubi_wl_get_peb(struct ubi_device *ubi)
if (pool->used == pool->size) { spin_unlock(&ubi->wl_lock); - if (retried) { + attempts++; + if (attempts == 10) { ubi_err(ubi, "Unable to get a free PEB from user WL pool"); ret = -ENOSPC; goto out; } - retried = 1; up_read(&ubi->fm_eba_sem); ret = produce_free_peb(ubi); if (ret < 0) {
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ]
This patch fixes the lock inversion complaint:
============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ #1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm]
but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 #1: 000000001efd357b ((work_completion)(&work->work)#3){+.+.}, at: process_one_work+0x476/0xac0 #2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30
This is not a bug as there are actually two lock classes here.
Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org Fixes: de910bd92137 ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cma.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 1454290078def..8ad9c6b04769d 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -1976,9 +1976,10 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, conn_id->cm_id.iw = NULL; cma_exch(conn_id, RDMA_CM_DESTROYING); mutex_unlock(&conn_id->handler_mutex); + mutex_unlock(&listen_id->handler_mutex); cma_deref_id(conn_id); rdma_destroy_id(&conn_id->id); - goto out; + return ret; }
mutex_unlock(&conn_id->handler_mutex);
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit a2b90f11217790ec0964ba9c93a4abb369758c26 ]
A removable block device, such as NVMe or SSD connected over Thunderbolt can be hot-removed any time including when the system is suspended. When device is hot-removed during suspend and the system gets resumed, kernel first resumes devices and then thaws the userspace including freezable workqueues. What happens in that case is that the NVMe driver notices that the device is unplugged and removes it from the system. This ends up calling bdi_unregister() for the gendisk which then schedules wb_workfn() to be run one more time.
However, since the bdi_wq is still frozen flush_delayed_work() call in wb_shutdown() blocks forever halting system resume process. User sees this as hang as nothing is happening anymore.
Triggering sysrq-w reveals this:
Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme] Call Trace: ? __schedule+0x2c5/0x630 ? wait_for_completion+0xa4/0x120 schedule+0x3e/0xc0 schedule_timeout+0x1c9/0x320 ? resched_curr+0x1f/0xd0 ? wait_for_completion+0xa4/0x120 wait_for_completion+0xc3/0x120 ? wake_up_q+0x60/0x60 __flush_work+0x131/0x1e0 ? flush_workqueue_prep_pwqs+0x130/0x130 bdi_unregister+0xb9/0x130 del_gendisk+0x2d2/0x2e0 nvme_ns_remove+0xed/0x110 [nvme_core] nvme_remove_namespaces+0x96/0xd0 [nvme_core] nvme_remove+0x5b/0x160 [nvme] pci_device_remove+0x36/0x90 device_release_driver_internal+0xdf/0x1c0 nvme_remove_dead_ctrl_work+0x14/0x30 [nvme] process_one_work+0x1c2/0x3f0 worker_thread+0x48/0x3e0 kthread+0x100/0x140 ? current_work+0x30/0x30 ? kthread_park+0x80/0x80 ret_from_fork+0x35/0x40
This is not limited to NVMes so exactly same issue can be reproduced by hot-removing SSD (over Thunderbolt) while the system is suspended.
Prevent this from happening by removing WQ_FREEZABLE from bdi_wq.
Reported-by: AceLan Kao acelan.kao@canonical.com Link: https://marc.info/?l=linux-kernel&m=138695698516487 Link: https://bugzilla.kernel.org/show_bug.cgi?id=204385 Link: https://lore.kernel.org/lkml/20191002122136.GD2819@lahna.fi.intel.com/#t Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- mm/backing-dev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c index 07e3b3b8e8469..c682fb90cf356 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -245,8 +245,8 @@ static int __init default_bdi_init(void) { int err;
- bdi_wq = alloc_workqueue("writeback", WQ_MEM_RECLAIM | WQ_FREEZABLE | - WQ_UNBOUND | WQ_SYSFS, 0); + bdi_wq = alloc_workqueue("writeback", WQ_MEM_RECLAIM | WQ_UNBOUND | + WQ_SYSFS, 0); if (!bdi_wq) return -ENOMEM;
linux-stable-mirror@lists.linaro.org