This is the start of the stable review cycle for the 4.9.287 release. There are 25 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 16 Oct 2021 14:51:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.287-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.9.287-rc1
Anand K Mistry amistry@google.com perf/x86: Reset destroy callback on event init failure
Colin Ian King colin.king@canonical.com scsi: virtio_scsi: Fix spelling mistake "Unsupport" -> "Unsupported"
Jiapeng Chong jiapeng.chong@linux.alibaba.com scsi: ses: Fix unsigned comparison with less than zero
YueHaibing yuehaibing@huawei.com mac80211: Drop frames from invalid MAC address in ad-hoc mode
Jeremy Sowden jeremy@azazel.net netfilter: ip6_tables: zero-initialize fragment offset
Mizuho Mori morimolymoly@gmail.com HID: apple: Fix logical maximum and usage maximum of Magic Keyboard JIS
Linus Torvalds torvalds@linux-foundation.org gup: document and work around "COW can break either way" issue
Jiri Benc jbenc@redhat.com i40e: fix endless loop under rtnl
Eric Dumazet edumazet@google.com rtnetlink: fix if_nlmsg_stats_size() under estimation
Yang Yingliang yangyingliang@huawei.com drm/nouveau/debugfs: fix file release memory leak
Eric Dumazet edumazet@google.com netlink: annotate data races around nlk->bound
Eric Dumazet edumazet@google.com net: bridge: use nla_total_size_64bit() in br_get_linkxstats_size()
Oleksij Rempel o.rempel@pengutronix.de ARM: imx6: disable the GIC CPU interface before calling stby-poweroff sequence
Andy Shevchenko andriy.shevchenko@linux.intel.com ptp_pch: Load module automatically if ID matches
Pali Rohár pali@kernel.org powerpc/fsl/dts: Fix phy-connection-type for fm1mac3
Eric Dumazet edumazet@google.com net_sched: fix NULL deref in fifo_set_limit()
Pavel Skripkin paskripkin@gmail.com phy: mdio: fix memory leak
Tatsuhiko Yasumatsu th.yasumatsu@gmail.com bpf: Fix integer overflow in prealloc_elems_and_freelist()
Max Filippov jcmvbkbc@gmail.com xtensa: call irqchip_init only when CONFIG_USE_OF is selected
Roger Quadros rogerq@kernel.org ARM: dts: omap3430-sdp: Fix NAND device node
Trond Myklebust trond.myklebust@hammerspace.com nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero
Zheng Liang zhengliang6@huawei.com ovl: fix missing negative dentry check in ovl_rename()
Johan Hovold johan@kernel.org USB: cdc-acm: fix break reporting
Johan Hovold johan@kernel.org USB: cdc-acm: fix racy tty buffer accesses
Ben Hutchings ben@decadent.org.uk Partially revert "usb: Kconfig: using select for USB_COMMON dependency"
-------------
Diffstat:
Makefile | 4 +-- arch/arm/boot/dts/omap3430-sdp.dts | 2 +- arch/arm/mach-imx/pm-imx6.c | 2 ++ arch/powerpc/boot/dts/fsl/t1023rdb.dts | 2 +- arch/x86/events/core.c | 1 + arch/xtensa/kernel/irq.c | 2 +- drivers/gpu/drm/nouveau/nouveau_debugfs.c | 1 + drivers/hid/hid-apple.c | 7 +++++ drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +- drivers/net/phy/mdio_bus.c | 7 +++++ drivers/ptp/ptp_pch.c | 1 + drivers/scsi/ses.c | 2 +- drivers/scsi/virtio_scsi.c | 4 +-- drivers/usb/Kconfig | 3 +- drivers/usb/class/cdc-acm.c | 8 +++++ fs/nfsd/nfs4xdr.c | 19 +++++++----- fs/overlayfs/dir.c | 10 ++++-- kernel/bpf/stackmap.c | 3 +- mm/gup.c | 48 ++++++++++++++++++++++++----- mm/huge_memory.c | 7 ++--- net/bridge/br_netlink.c | 2 +- net/core/rtnetlink.c | 2 +- net/ipv6/netfilter/ip6_tables.c | 1 + net/mac80211/rx.c | 3 +- net/netlink/af_netlink.c | 14 ++++++--- net/sched/sch_fifo.c | 3 ++ 26 files changed, 118 insertions(+), 42 deletions(-)
From: Ben Hutchings ben@decadent.org.uk
commit 4d1aa9112c8e6995ef2c8a76972c9671332ccfea upstream.
This reverts commit cb9c1cfc86926d0e86d19c8e34f6c23458cd3478 for USB_LED_TRIG. This config symbol has bool type and enables extra code in usb_common itself, not a separate driver. Enabling it should not force usb_common to be built-in!
Fixes: cb9c1cfc8692 ("usb: Kconfig: using select for USB_COMMON dependency") Cc: stable stable@vger.kernel.org Signed-off-by: Ben Hutchings ben@decadent.org.uk Signed-off-by: Salvatore Bonaccorso carnil@debian.org Link: https://lore.kernel.org/r/20210921143442.340087-1-carnil@debian.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/Kconfig | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/usb/Kconfig +++ b/drivers/usb/Kconfig @@ -160,8 +160,7 @@ source "drivers/usb/gadget/Kconfig"
config USB_LED_TRIG bool "USB LED Triggers" - depends on LEDS_CLASS && LEDS_TRIGGERS - select USB_COMMON + depends on LEDS_CLASS && USB_COMMON && LEDS_TRIGGERS help This option adds LED triggers for USB host and/or gadget activity.
From: Johan Hovold johan@kernel.org
commit 65a205e6113506e69a503b61d97efec43fc10fd7 upstream.
A recent change that started reporting break events to the line discipline caused the tty-buffer insertions to no longer be serialised by inserting events also from the completion handler for the interrupt endpoint.
Completion calls for distinct endpoints are not guaranteed to be serialised. For example, in case a host-controller driver uses bottom-half completion, the interrupt and bulk-in completion handlers can end up running in parallel on two CPUs (high-and low-prio tasklets, respectively) thereby breaking the tty layer's single producer assumption.
Fix this by holding the read lock also when inserting characters from the bulk endpoint.
Fixes: 08dff274edda ("cdc-acm: fix BREAK rx code path adding necessary calls") Cc: stable@vger.kernel.org Acked-by: Oliver Neukum oneukum@suse.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20210929090937.7410-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/cdc-acm.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -408,11 +408,16 @@ static int acm_submit_read_urbs(struct a
static void acm_process_read_urb(struct acm *acm, struct urb *urb) { + unsigned long flags; + if (!urb->actual_length) return;
+ spin_lock_irqsave(&acm->read_lock, flags); tty_insert_flip_string(&acm->port, urb->transfer_buffer, urb->actual_length); + spin_unlock_irqrestore(&acm->read_lock, flags); + tty_flip_buffer_push(&acm->port); }
From: Johan Hovold johan@kernel.org
commit 58fc1daa4d2e9789b9ffc880907c961ea7c062cc upstream.
A recent change that started reporting break events forgot to push the event to the line discipline, which meant that a detected break would not be reported until further characters had been receive (the port could even have been closed and reopened in between).
Fixes: 08dff274edda ("cdc-acm: fix BREAK rx code path adding necessary calls") Cc: stable@vger.kernel.org Acked-by: Oliver Neukum oneukum@suse.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20210929090937.7410-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/cdc-acm.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -349,6 +349,9 @@ static void acm_ctrl_irq(struct urb *urb acm->iocount.overrun++; spin_unlock(&acm->read_lock);
+ if (newctrl & ACM_CTRL_BRK) + tty_flip_buffer_push(&acm->port); + if (difference) wake_up_all(&acm->wioctl);
From: Zheng Liang zhengliang6@huawei.com
commit a295aef603e109a47af355477326bd41151765b6 upstream.
The following reproducer
mkdir lower upper work merge touch lower/old touch lower/new mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge rm merge/new mv merge/old merge/new & unlink upper/new
may result in this race:
PROCESS A: rename("merge/old", "merge/new"); overwrite=true,ovl_lower_positive(old)=true, ovl_dentry_is_whiteout(new)=true -> flags |= RENAME_EXCHANGE
PROCESS B: unlink("upper/new");
PROCESS A: lookup newdentry in new_upperdir call vfs_rename() with negative newdentry and RENAME_EXCHANGE
Fix by adding the missing check for negative newdentry.
Signed-off-by: Zheng Liang zhengliang6@huawei.com Fixes: e9be9d5e76e3 ("overlay filesystem") Cc: stable@vger.kernel.org # v3.18 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/overlayfs/dir.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -926,9 +926,13 @@ static int ovl_rename2(struct inode *old goto out_dput; } } else { - if (!d_is_negative(newdentry) && - (!new_opaque || !ovl_is_whiteout(newdentry))) - goto out_dput; + if (!d_is_negative(newdentry)) { + if (!new_opaque || !ovl_is_whiteout(newdentry)) + goto out_dput; + } else { + if (flags & RENAME_EXCHANGE) + goto out_dput; + } }
if (olddentry == trap)
From: Trond Myklebust trond.myklebust@hammerspace.com
commit f2e717d655040d632c9015f19aa4275f8b16e7f2 upstream.
RFC3530 notes that the 'dircount' field may be zero, in which case the recommendation is to ignore it, and only enforce the 'maxcount' field. In RFC5661, this recommendation to ignore a zero valued field becomes a requirement.
Fixes: aee377644146 ("nfsd4: fix rd_dircount enforcement") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4xdr.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
--- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -3028,15 +3028,18 @@ nfsd4_encode_dirent(void *ccdv, const ch goto fail; cd->rd_maxcount -= entry_bytes; /* - * RFC 3530 14.2.24 describes rd_dircount as only a "hint", so - * let's always let through the first entry, at least: + * RFC 3530 14.2.24 describes rd_dircount as only a "hint", and + * notes that it could be zero. If it is zero, then the server + * should enforce only the rd_maxcount value. */ - if (!cd->rd_dircount) - goto fail; - name_and_cookie = 4 + 4 * XDR_QUADLEN(namlen) + 8; - if (name_and_cookie > cd->rd_dircount && cd->cookie_offset) - goto fail; - cd->rd_dircount -= min(cd->rd_dircount, name_and_cookie); + if (cd->rd_dircount) { + name_and_cookie = 4 + 4 * XDR_QUADLEN(namlen) + 8; + if (name_and_cookie > cd->rd_dircount && cd->cookie_offset) + goto fail; + cd->rd_dircount -= min(cd->rd_dircount, name_and_cookie); + if (!cd->rd_dircount) + cd->rd_maxcount = 0; + }
cd->cookie_offset = cookie_offset; skip_entry:
From: Roger Quadros rogerq@kernel.org
commit 80d680fdccba214e8106dc1aa33de5207ad75394 upstream.
Nand is on CS1 so reg properties first field should be 1 not 0.
Fixes: 44e4716499b8 ("ARM: dts: omap3: Fix NAND device nodes") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Roger Quadros rogerq@kernel.org Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/boot/dts/omap3430-sdp.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/boot/dts/omap3430-sdp.dts +++ b/arch/arm/boot/dts/omap3430-sdp.dts @@ -104,7 +104,7 @@
nand@1,0 { compatible = "ti,omap2-nand"; - reg = <0 0 4>; /* CS0, offset 0, IO size 4 */ + reg = <1 0 4>; /* CS1, offset 0, IO size 4 */ interrupt-parent = <&gpmc>; interrupts = <0 IRQ_TYPE_NONE>, /* fifoevent */ <1 IRQ_TYPE_NONE>; /* termcount */
From: Max Filippov jcmvbkbc@gmail.com
[ Upstream commit 6489f8d0e1d93a3603d8dad8125797559e4cf2a2 ]
During boot time kernel configured with OF=y but USE_OF=n displays the following warnings and hangs shortly after starting userspace:
------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/irq/irqdomain.c:695 irq_create_mapping_affinity+0x29/0xc0 irq_create_mapping_affinity(, 6) called with NULL domain CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc3-00001-gd67ed2510d28 #30 Call Trace: __warn+0x69/0xc4 warn_slowpath_fmt+0x6c/0x94 irq_create_mapping_affinity+0x29/0xc0 local_timer_setup+0x40/0x88 time_init+0xb1/0xe8 start_kernel+0x31d/0x3f4 _startup+0x13b/0x13b ---[ end trace 1e6630e1c5eda35b ]--- ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at arch/xtensa/kernel/time.c:141 local_timer_setup+0x58/0x88 error: can't map timer irq CPU: 0 PID: 0 Comm: swapper Tainted: G W 5.15.0-rc3-00001-gd67ed2510d28 #30 Call Trace: __warn+0x69/0xc4 warn_slowpath_fmt+0x6c/0x94 local_timer_setup+0x58/0x88 time_init+0xb1/0xe8 start_kernel+0x31d/0x3f4 _startup+0x13b/0x13b ---[ end trace 1e6630e1c5eda35c ]--- Failed to request irq 0 (timer)
Fix that by calling irqchip_init only when CONFIG_USE_OF is selected and calling legacy interrupt controller init otherwise.
Fixes: da844a81779e ("xtensa: add device trees support") Signed-off-by: Max Filippov jcmvbkbc@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/xtensa/kernel/irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/xtensa/kernel/irq.c b/arch/xtensa/kernel/irq.c index 441694464b1e..fbbc24b914e3 100644 --- a/arch/xtensa/kernel/irq.c +++ b/arch/xtensa/kernel/irq.c @@ -144,7 +144,7 @@ unsigned xtensa_get_ext_irq_no(unsigned irq)
void __init init_IRQ(void) { -#ifdef CONFIG_OF +#ifdef CONFIG_USE_OF irqchip_init(); #else #ifdef CONFIG_HAVE_SMP
From: Tatsuhiko Yasumatsu th.yasumatsu@gmail.com
[ Upstream commit 30e29a9a2bc6a4888335a6ede968b75cd329657a ]
In prealloc_elems_and_freelist(), the multiplication to calculate the size passed to bpf_map_area_alloc() could lead to an integer overflow. As a result, out-of-bounds write could occur in pcpu_freelist_populate() as reported by KASAN:
[...] [ 16.968613] BUG: KASAN: slab-out-of-bounds in pcpu_freelist_populate+0xd9/0x100 [ 16.969408] Write of size 8 at addr ffff888104fc6ea0 by task crash/78 [ 16.970038] [ 16.970195] CPU: 0 PID: 78 Comm: crash Not tainted 5.15.0-rc2+ #1 [ 16.970878] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 16.972026] Call Trace: [ 16.972306] dump_stack_lvl+0x34/0x44 [ 16.972687] print_address_description.constprop.0+0x21/0x140 [ 16.973297] ? pcpu_freelist_populate+0xd9/0x100 [ 16.973777] ? pcpu_freelist_populate+0xd9/0x100 [ 16.974257] kasan_report.cold+0x7f/0x11b [ 16.974681] ? pcpu_freelist_populate+0xd9/0x100 [ 16.975190] pcpu_freelist_populate+0xd9/0x100 [ 16.975669] stack_map_alloc+0x209/0x2a0 [ 16.976106] __sys_bpf+0xd83/0x2ce0 [...]
The possibility of this overflow was originally discussed in [0], but was overlooked.
Fix the integer overflow by changing elem_size to u64 from u32.
[0] https://lore.kernel.org/bpf/728b238e-a481-eb50-98e9-b0f430ab01e7@gmail.com/
Fixes: 557c0c6e7df8 ("bpf: convert stackmap to pre-allocation") Signed-off-by: Tatsuhiko Yasumatsu th.yasumatsu@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20210930135545.173698-1-th.yasumatsu@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/stackmap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 2fdf6f96f976..6f09728cd1dd 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -28,7 +28,8 @@ struct bpf_stack_map {
static int prealloc_elems_and_freelist(struct bpf_stack_map *smap) { - u32 elem_size = sizeof(struct stack_map_bucket) + smap->map.value_size; + u64 elem_size = sizeof(struct stack_map_bucket) + + (u64)smap->map.value_size; int err;
smap->elems = bpf_map_area_alloc(elem_size * smap->map.max_entries);
From: Pavel Skripkin paskripkin@gmail.com
[ Upstream commit ca6e11c337daf7925ff8a2aac8e84490a8691905 ]
Syzbot reported memory leak in MDIO bus interface, the problem was in wrong state logic.
MDIOBUS_ALLOCATED indicates 2 states: 1. Bus is only allocated 2. Bus allocated and __mdiobus_register() fails, but device_register() was called
In case of device_register() has been called we should call put_device() to correctly free the memory allocated for this device, but mdiobus_free() calls just kfree(dev) in case of MDIOBUS_ALLOCATED state
To avoid this behaviour we need to set bus->state to MDIOBUS_UNREGISTERED _before_ calling device_register(), because put_device() should be called even in case of device_register() failure.
Link: https://lore.kernel.org/netdev/YVMRWNDZDUOvQjHL@shell.armlinux.org.uk/ Fixes: 46abc02175b3 ("phylib: give mdio buses a device tree presence") Reported-and-tested-by: syzbot+398e7dc692ddbbb4cfec@syzkaller.appspotmail.com Reviewed-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Pavel Skripkin paskripkin@gmail.com Link: https://lore.kernel.org/r/eceae1429fbf8fa5c73dd2a0d39d525aa905074d.163302406... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/mdio_bus.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index 8cc7563ab103..92fb664b56fb 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -316,6 +316,13 @@ int __mdiobus_register(struct mii_bus *bus, struct module *owner) bus->dev.groups = NULL; dev_set_name(&bus->dev, "%s", bus->id);
+ /* We need to set state to MDIOBUS_UNREGISTERED to correctly release + * the device in mdiobus_free() + * + * State will be updated later in this function in case of success + */ + bus->state = MDIOBUS_UNREGISTERED; + err = device_register(&bus->dev); if (err) { pr_err("mii_bus %s failed to register\n", bus->id);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 560ee196fe9e5037e5015e2cdb14b3aecb1cd7dc ]
syzbot reported another NULL deref in fifo_set_limit() [1]
I could repro the issue with :
unshare -n tc qd add dev lo root handle 1:0 tbf limit 200000 burst 70000 rate 100Mbit tc qd replace dev lo parent 1:0 pfifo_fast tc qd change dev lo root handle 1:0 tbf limit 300000 burst 70000 rate 100Mbit
pfifo_fast does not have a change() operation. Make fifo_set_limit() more robust about this.
[1] BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 1cf99067 P4D 1cf99067 PUD 7ca49067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 1 PID: 14443 Comm: syz-executor959 Not tainted 5.15.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffffc9000e2f7310 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffffffff8d6ecc00 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff888024c27910 RDI: ffff888071e34000 RBP: ffff888071e34000 R08: 0000000000000001 R09: ffffffff8fcfb947 R10: 0000000000000001 R11: 0000000000000000 R12: ffff888024c27910 R13: ffff888071e34018 R14: 0000000000000000 R15: ffff88801ef74800 FS: 00007f321d897700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000722c3000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: fifo_set_limit net/sched/sch_fifo.c:242 [inline] fifo_set_limit+0x198/0x210 net/sched/sch_fifo.c:227 tbf_change+0x6ec/0x16d0 net/sched/sch_tbf.c:418 qdisc_change net/sched/sch_api.c:1332 [inline] tc_modify_qdisc+0xd9a/0x1a60 net/sched/sch_api.c:1634 rtnetlink_rcv_msg+0x413/0xb80 net/core/rtnetlink.c:5572 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504 netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340 netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: fb0305ce1b03 ("net-sched: consolidate default fifo qdisc setup") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Link: https://lore.kernel.org/r/20210930212239.3430364-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_fifo.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c index 1e37247656f8..8b7110cbcce4 100644 --- a/net/sched/sch_fifo.c +++ b/net/sched/sch_fifo.c @@ -151,6 +151,9 @@ int fifo_set_limit(struct Qdisc *q, unsigned int limit) if (strncmp(q->ops->id + 1, "fifo", 4) != 0) return 0;
+ if (!q->ops->change) + return 0; + nla = kmalloc(nla_attr_size(sizeof(struct tc_fifo_qopt)), GFP_KERNEL); if (nla) { nla->nla_type = RTM_NEWQDISC;
From: Pali Rohár pali@kernel.org
[ Upstream commit eed183abc0d3b8adb64fd1363b7cea7986cd58d6 ]
Property phy-connection-type contains invalid value "sgmii-2500" per scheme defined in file ethernet-controller.yaml.
Correct phy-connection-type value should be "2500base-x".
Signed-off-by: Pali Rohár pali@kernel.org Fixes: 84e0f1c13806 ("powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)") Acked-by: Scott Wood oss@buserror.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/dts/fsl/t1023rdb.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/boot/dts/fsl/t1023rdb.dts b/arch/powerpc/boot/dts/fsl/t1023rdb.dts index 29757623e5ba..f5f8f969dd58 100644 --- a/arch/powerpc/boot/dts/fsl/t1023rdb.dts +++ b/arch/powerpc/boot/dts/fsl/t1023rdb.dts @@ -125,7 +125,7 @@
fm1mac3: ethernet@e4000 { phy-handle = <&sgmii_aqr_phy3>; - phy-connection-type = "sgmii-2500"; + phy-connection-type = "2500base-x"; sleep = <&rcpm 0x20000000>; };
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 7cd8b1542a7ba0720c5a0a85ed414a122015228b ]
The driver can't be loaded automatically because it misses module alias to be provided. Add corresponding MODULE_DEVICE_TABLE() call to the driver.
Fixes: 863d08ece9bf ("supports eg20t ptp clock") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_pch.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/ptp/ptp_pch.c b/drivers/ptp/ptp_pch.c index 3aa22ae4d94c..a911325fc0b4 100644 --- a/drivers/ptp/ptp_pch.c +++ b/drivers/ptp/ptp_pch.c @@ -698,6 +698,7 @@ static const struct pci_device_id pch_ieee1588_pcidev_id[] = { }, {0} }; +MODULE_DEVICE_TABLE(pci, pch_ieee1588_pcidev_id);
static struct pci_driver pch_driver = { .name = KBUILD_MODNAME,
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 783f3db030563f7bcdfe2d26428af98ea1699a8e ]
Any pending interrupt can prevent entering standby based power off state. To avoid it, disable the GIC CPU interface.
Fixes: 8148d2136002 ("ARM: imx6: register pm_power_off handler if "fsl,pmic-stby-poweroff" is set") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-imx/pm-imx6.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm/mach-imx/pm-imx6.c b/arch/arm/mach-imx/pm-imx6.c index 6da26692f2fd..950c9f2ffe00 100644 --- a/arch/arm/mach-imx/pm-imx6.c +++ b/arch/arm/mach-imx/pm-imx6.c @@ -15,6 +15,7 @@ #include <linux/io.h> #include <linux/irq.h> #include <linux/genalloc.h> +#include <linux/irqchip/arm-gic.h> #include <linux/mfd/syscon.h> #include <linux/mfd/syscon/imx6q-iomuxc-gpr.h> #include <linux/of.h> @@ -606,6 +607,7 @@ static void __init imx6_pm_common_init(const struct imx6_pm_socdata
static void imx6_pm_stby_poweroff(void) { + gic_cpu_if_down(0); imx6_set_lpm(STOP_POWER_OFF); imx6q_suspend_finish(0);
From: Eric Dumazet edumazet@google.com
[ Upstream commit dbe0b88064494b7bb6a9b2aa7e085b14a3112d44 ]
bridge_fill_linkxstats() is using nla_reserve_64bit().
We must use nla_total_size_64bit() instead of nla_total_size() for corresponding data structure.
Fixes: 1080ab95e3c7 ("net: bridge: add support for IGMP/MLD stats and export them via netlink") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Nikolay Aleksandrov nikolay@nvidia.com Cc: Vivien Didelot vivien.didelot@gmail.com Acked-by: Nikolay Aleksandrov nikolay@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_netlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 4f831225d34f..ca8757090ae3 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -1298,7 +1298,7 @@ static size_t br_get_linkxstats_size(const struct net_device *dev, int attr) }
return numvls * nla_total_size(sizeof(struct bridge_vlan_xstats)) + - nla_total_size(sizeof(struct br_mcast_stats)) + + nla_total_size_64bit(sizeof(struct br_mcast_stats)) + nla_total_size(0); }
From: Eric Dumazet edumazet@google.com
[ Upstream commit 7707a4d01a648e4c655101a469c956cb11273655 ]
While existing code is correct, KCSAN is reporting a data-race in netlink_insert / netlink_sendmsg [1]
It is correct to read nlk->bound without a lock, as netlink_autobind() will acquire all needed locks.
[1] BUG: KCSAN: data-race in netlink_insert / netlink_sendmsg
write to 0xffff8881031c8b30 of 1 bytes by task 18752 on cpu 0: netlink_insert+0x5cc/0x7f0 net/netlink/af_netlink.c:597 netlink_autobind+0xa9/0x150 net/netlink/af_netlink.c:842 netlink_sendmsg+0x479/0x7c0 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:703 [inline] sock_sendmsg net/socket.c:723 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392 ___sys_sendmsg net/socket.c:2446 [inline] __sys_sendmsg+0x1ed/0x270 net/socket.c:2475 __do_sys_sendmsg net/socket.c:2484 [inline] __se_sys_sendmsg net/socket.c:2482 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2482 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff8881031c8b30 of 1 bytes by task 18751 on cpu 1: netlink_sendmsg+0x270/0x7c0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:703 [inline] sock_sendmsg net/socket.c:723 [inline] __sys_sendto+0x2a8/0x370 net/socket.c:2019 __do_sys_sendto net/socket.c:2031 [inline] __se_sys_sendto net/socket.c:2027 [inline] __x64_sys_sendto+0x74/0x90 net/socket.c:2027 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x00 -> 0x01
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 18751 Comm: syz-executor.0 Not tainted 5.14.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: da314c9923fe ("netlink: Replace rhash_portid with bound") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/netlink/af_netlink.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 453b0efdc0d7..1b70de5898c4 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -574,7 +574,10 @@ static int netlink_insert(struct sock *sk, u32 portid)
/* We need to ensure that the socket is hashed and visible. */ smp_wmb(); - nlk_sk(sk)->bound = portid; + /* Paired with lockless reads from netlink_bind(), + * netlink_connect() and netlink_sendmsg(). + */ + WRITE_ONCE(nlk_sk(sk)->bound, portid);
err: release_sock(sk); @@ -993,7 +996,8 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr, else if (nlk->ngroups < 8*sizeof(groups)) groups &= (1UL << nlk->ngroups) - 1;
- bound = nlk->bound; + /* Paired with WRITE_ONCE() in netlink_insert() */ + bound = READ_ONCE(nlk->bound); if (bound) { /* Ensure nlk->portid is up-to-date. */ smp_rmb(); @@ -1073,8 +1077,9 @@ static int netlink_connect(struct socket *sock, struct sockaddr *addr,
/* No need for barriers here as we return to user-space without * using any of the bound attributes. + * Paired with WRITE_ONCE() in netlink_insert(). */ - if (!nlk->bound) + if (!READ_ONCE(nlk->bound)) err = netlink_autobind(sock);
if (err == 0) { @@ -1821,7 +1826,8 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) dst_group = nlk->dst_group; }
- if (!nlk->bound) { + /* Paired with WRITE_ONCE() in netlink_insert() */ + if (!READ_ONCE(nlk->bound)) { err = netlink_autobind(sock); if (err) goto out;
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit f5a8703a9c418c6fc54eb772712dfe7641e3991c ]
When using single_open() for opening, single_release() should be called, otherwise the 'op' allocated in single_open() will be leaked.
Fixes: 6e9fc177399f ("drm/nouveau/debugfs: add copy of sysfs pstate interface ported to debugfs") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Karol Herbst kherbst@redhat.com Signed-off-by: Karol Herbst kherbst@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20210911075023.3969054-2-yangy... Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nouveau_debugfs.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/nouveau/nouveau_debugfs.c b/drivers/gpu/drm/nouveau/nouveau_debugfs.c index 411c12cdb249..bb516eb12421 100644 --- a/drivers/gpu/drm/nouveau/nouveau_debugfs.c +++ b/drivers/gpu/drm/nouveau/nouveau_debugfs.c @@ -178,6 +178,7 @@ static const struct file_operations nouveau_pstate_fops = { .open = nouveau_debugfs_pstate_open, .read = seq_read, .write = nouveau_debugfs_pstate_set, + .release = single_release, };
static struct drm_info_list nouveau_debugfs_list[] = {
From: Eric Dumazet edumazet@google.com
[ Upstream commit d34367991933d28bd7331f67a759be9a8c474014 ]
rtnl_fill_statsinfo() is filling skb with one mandatory if_stats_msg structure.
nlmsg_put(skb, pid, seq, type, sizeof(struct if_stats_msg), flags);
But if_nlmsg_stats_size() never considered the needed storage.
This bug did not show up because alloc_skb(X) allocates skb with extra tailroom, because of added alignments. This could very well be changed in the future to have deterministic behavior.
Fixes: 10c9ead9f3c6 ("rtnetlink: add new RTM_GETSTATS message to dump link stats") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Roopa Prabhu roopa@nvidia.com Acked-by: Roopa Prabhu roopa@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/rtnetlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 911752e8a3e6..012143f313a8 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -3900,7 +3900,7 @@ nla_put_failure: static size_t if_nlmsg_stats_size(const struct net_device *dev, u32 filter_mask) { - size_t size = 0; + size_t size = NLMSG_ALIGN(sizeof(struct if_stats_msg));
if (stats_attr_valid(filter_mask, IFLA_STATS_LINK_64, 0)) size += nla_total_size_64bit(sizeof(struct rtnl_link_stats64));
From: Jiri Benc jbenc@redhat.com
[ Upstream commit 857b6c6f665cca9828396d9743faf37fd09e9ac3 ]
The loop in i40e_get_capabilities can never end. The problem is that although i40e_aq_discover_capabilities returns with an error if there's a firmware problem, the returned error is not checked. There is a check for pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most firmware problems.
When i40e_aq_discover_capabilities encounters a firmware problem, it will encounter the same problem on its next invocation. As the result, the loop becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking at the code, it can happen with a range of other firmware errors.
I don't know what the correct behavior should be: whether the firmware should be retried a few times, or whether pf->hw.aq.asq_last_status should be always set to the encountered firmware error (but then it would be pointless and can be just replaced by the i40e_aq_discover_capabilities return value). However, the current behavior with an endless loop under the rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we explained the bug to them 7 months ago.
This may not be the best possible fix but it's better than hanging the whole system on a firmware bug.
Fixes: 56a62fc86895 ("i40e: init code and hardware support") Tested-by: Stefan Assmann sassmann@redhat.com Signed-off-by: Jiri Benc jbenc@redhat.com Reviewed-by: Jesse Brandeburg jesse.brandeburg@intel.com Tested-by: Dave Switzer david.switzer@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 832fffed4a1f..e7585f6c4665 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -6646,7 +6646,7 @@ static int i40e_get_capabilities(struct i40e_pf *pf) if (pf->hw.aq.asq_last_status == I40E_AQ_RC_ENOMEM) { /* retry with a larger buffer */ buf_len = data_size; - } else if (pf->hw.aq.asq_last_status != I40E_AQ_RC_OK) { + } else if (pf->hw.aq.asq_last_status != I40E_AQ_RC_OK || err) { dev_info(&pf->pdev->dev, "capability discovery failed, err %s aq_err %s\n", i40e_stat_str(&pf->hw, err),
From: Linus Torvalds torvalds@linux-foundation.org
commit 17839856fd588f4ab6b789f482ed3ffd7c403e1f upstream.
Doing a "get_user_pages()" on a copy-on-write page for reading can be ambiguous: the page can be COW'ed at any time afterwards, and the direction of a COW event isn't defined.
Yes, whoever writes to it will generally do the COW, but if the thread that did the get_user_pages() unmapped the page before the write (and that could happen due to memory pressure in addition to any outright action), the writer could also just take over the old page instead.
End result: the get_user_pages() call might result in a page pointer that is no longer associated with the original VM, and is associated with - and controlled by - another VM having taken it over instead.
So when doing a get_user_pages() on a COW mapping, the only really safe thing to do would be to break the COW when getting the page, even when only getting it for reading.
At the same time, some users simply don't even care.
For example, the perf code wants to look up the page not because it cares about the page, but because the code simply wants to look up the physical address of the access for informational purposes, and doesn't really care about races when a page might be unmapped and remapped elsewhere.
This adds logic to force a COW event by setting FOLL_WRITE on any copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page pointer as a result.
The current semantics end up being:
- __get_user_pages_fast(): no change. If you don't ask for a write, you won't break COW. You'd better know what you're doing.
- get_user_pages_fast(): the fast-case "look it up in the page tables without anything getting mmap_sem" now refuses to follow a read-only page, since it might need COW breaking. Which happens in the slow path - the fast path doesn't know if the memory might be COW or not.
- get_user_pages() (including the slow-path fallback for gup_fast()): for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with very similar semantics to FOLL_FORCE.
If it turns out that we want finer granularity (ie "only break COW when it might actually matter" - things like the zero page are special and don't need to be broken) we might need to push these semantics deeper into the lookup fault path. So if people care enough, it's possible that we might end up adding a new internal FOLL_BREAK_COW flag to go with the internal FOLL_COW flag we already have for tracking "I had a COW".
Alternatively, if it turns out that different callers might want to explicitly control the forced COW break behavior, we might even want to make such a flag visible to the users of get_user_pages() instead of using the above default semantics.
But for now, this is mostly commentary on the issue (this commit message being a lot bigger than the patch, and that patch in turn is almost all comments), with that minimal "enable COW breaking early" logic using the existing FOLL_WRITE behavior.
[ It might be worth noting that we've always had this ambiguity, and it could arguably be seen as a user-space issue.
You only get private COW mappings that could break either way in situations where user space is doing cooperative things (ie fork() before an execve() etc), but it _is_ surprising and very subtle, and fork() is supposed to give you independent address spaces.
So let's treat this as a kernel issue and make the semantics of get_user_pages() easier to understand. Note that obviously a true shared mapping will still get a page that can change under us, so this does _not_ mean that get_user_pages() somehow returns any "stable" page ]
[surenb: backport notes Since gup_pgd_range does not exist, made appropriate changes on the the gup_huge_pgd, gup_huge_pd and gup_pud_range calls instead. Replaced (gup_flags | FOLL_WRITE) with write=1 in gup_huge_pgd, gup_huge_pd and gup_pud_range. Removed FOLL_PIN usage in should_force_cow_break since it's missing in the earlier kernels.]
Reported-by: Jann Horn jannh@google.com Tested-by: Christoph Hellwig hch@lst.de Acked-by: Oleg Nesterov oleg@redhat.com Acked-by: Kirill Shutemov kirill@shutemov.name Acked-by: Jan Kara jack@suse.cz Cc: Andrea Arcangeli aarcange@redhat.com Cc: Matthew Wilcox willy@infradead.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org [surenb: backport to 4.9 kernel] Signed-off-by: Suren Baghdasaryan surenb@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/gup.c | 48 ++++++++++++++++++++++++++++++++++++++++-------- mm/huge_memory.c | 7 +++---- 2 files changed, 43 insertions(+), 12 deletions(-)
--- a/mm/gup.c +++ b/mm/gup.c @@ -61,13 +61,22 @@ static int follow_pfn_pte(struct vm_area }
/* - * FOLL_FORCE can write to even unwritable pte's, but only - * after we've gone through a COW cycle and they are dirty. + * FOLL_FORCE or a forced COW break can write even to unwritable pte's, + * but only after we've gone through a COW cycle and they are dirty. */ static inline bool can_follow_write_pte(pte_t pte, unsigned int flags) { - return pte_write(pte) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte)); + return pte_write(pte) || ((flags & FOLL_COW) && pte_dirty(pte)); +} + +/* + * A (separate) COW fault might break the page the other way and + * get_user_pages() would return the page from what is now the wrong + * VM. So we need to force a COW break at GUP time even for reads. + */ +static inline bool should_force_cow_break(struct vm_area_struct *vma, unsigned int flags) +{ + return is_cow_mapping(vma->vm_flags) && (flags & FOLL_GET); }
static struct page *follow_page_pte(struct vm_area_struct *vma, @@ -577,12 +586,18 @@ static long __get_user_pages(struct task if (!vma || check_vma_flags(vma, gup_flags)) return i ? : -EFAULT; if (is_vm_hugetlb_page(vma)) { + if (should_force_cow_break(vma, foll_flags)) + foll_flags |= FOLL_WRITE; i = follow_hugetlb_page(mm, vma, pages, vmas, &start, &nr_pages, i, - gup_flags); + foll_flags); continue; } } + + if (should_force_cow_break(vma, foll_flags)) + foll_flags |= FOLL_WRITE; + retry: /* * If we have a pending SIGKILL, don't keep faulting pages and @@ -1503,6 +1518,10 @@ static int gup_pud_range(pgd_t pgd, unsi /* * Like get_user_pages_fast() except it's IRQ-safe in that it won't fall back to * the regular GUP. It will only return non-negative values. + * + * Careful, careful! COW breaking can go either way, so a non-write + * access can get ambiguous page results. If you call this function without + * 'write' set, you'd better be sure that you're ok with that ambiguity. */ int __get_user_pages_fast(unsigned long start, int nr_pages, int write, struct page **pages) @@ -1532,6 +1551,12 @@ int __get_user_pages_fast(unsigned long * * We do not adopt an rcu_read_lock(.) here as we also want to * block IPIs that come from THPs splitting. + * + * NOTE! We allow read-only gup_fast() here, but you'd better be + * careful about possible COW pages. You'll get _a_ COW page, but + * not necessarily the one you intended to get depending on what + * COW event happens after this. COW may break the page copy in a + * random direction. */
local_irq_save(flags); @@ -1542,15 +1567,22 @@ int __get_user_pages_fast(unsigned long next = pgd_addr_end(addr, end); if (pgd_none(pgd)) break; + /* + * The FAST_GUP case requires FOLL_WRITE even for pure reads, + * because get_user_pages() may need to cause an early COW in + * order to avoid confusing the normal COW routines. So only + * targets that are already writable are safe to do by just + * looking at the page tables. + */ if (unlikely(pgd_huge(pgd))) { - if (!gup_huge_pgd(pgd, pgdp, addr, next, write, + if (!gup_huge_pgd(pgd, pgdp, addr, next, 1, pages, &nr)) break; } else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) { if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr, - PGDIR_SHIFT, next, write, pages, &nr)) + PGDIR_SHIFT, next, 1, pages, &nr)) break; - } else if (!gup_pud_range(pgd, addr, next, write, pages, &nr)) + } else if (!gup_pud_range(pgd, addr, next, 1, pages, &nr)) break; } while (pgdp++, addr = next, addr != end); local_irq_restore(flags); --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1135,13 +1135,12 @@ out_unlock: }
/* - * FOLL_FORCE can write to even unwritable pmd's, but only - * after we've gone through a COW cycle and they are dirty. + * FOLL_FORCE or a forced COW break can write even to unwritable pmd's, + * but only after we've gone through a COW cycle and they are dirty. */ static inline bool can_follow_write_pmd(pmd_t pmd, unsigned int flags) { - return pmd_write(pmd) || - ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd)); + return pmd_write(pmd) || ((flags & FOLL_COW) && pmd_dirty(pmd)); }
struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
From: Mizuho Mori morimolymoly@gmail.com
[ Upstream commit 67fd71ba16a37c663d139f5ba5296f344d80d072 ]
Apple Magic Keyboard(JIS)'s Logical Maximum and Usage Maximum are wrong.
Below is a report descriptor.
0x05, 0x01, /* Usage Page (Desktop), */ 0x09, 0x06, /* Usage (Keyboard), */ 0xA1, 0x01, /* Collection (Application), */ 0x85, 0x01, /* Report ID (1), */ 0x05, 0x07, /* Usage Page (Keyboard), */ 0x15, 0x00, /* Logical Minimum (0), */ 0x25, 0x01, /* Logical Maximum (1), */ 0x19, 0xE0, /* Usage Minimum (KB Leftcontrol), */ 0x29, 0xE7, /* Usage Maximum (KB Right GUI), */ 0x75, 0x01, /* Report Size (1), */ 0x95, 0x08, /* Report Count (8), */ 0x81, 0x02, /* Input (Variable), */ 0x95, 0x05, /* Report Count (5), */ 0x75, 0x01, /* Report Size (1), */ 0x05, 0x08, /* Usage Page (LED), */ 0x19, 0x01, /* Usage Minimum (01h), */ 0x29, 0x05, /* Usage Maximum (05h), */ 0x91, 0x02, /* Output (Variable), */ 0x95, 0x01, /* Report Count (1), */ 0x75, 0x03, /* Report Size (3), */ 0x91, 0x03, /* Output (Constant, Variable), */ 0x95, 0x08, /* Report Count (8), */ 0x75, 0x01, /* Report Size (1), */ 0x15, 0x00, /* Logical Minimum (0), */ 0x25, 0x01, /* Logical Maximum (1), */
here is a report descriptor which is parsed one in kernel. see sys/kernel/debug/hid/<dev>/rdesc
05 01 09 06 a1 01 85 01 05 07 15 00 25 01 19 e0 29 e7 75 01 95 08 81 02 95 05 75 01 05 08 19 01 29 05 91 02 95 01 75 03 91 03 95 08 75 01 15 00 25 01 06 00 ff 09 03 81 03 95 06 75 08 15 00 25 [65] 05 07 19 00 29 [65] 81 00 95 01 75 01 15 00 25 01 05 0c 09 b8 81 02 95 01 75 01 06 01 ff 09 03 81 02 95 01 75 06 81 03 06 02 ff 09 55 85 55 15 00 26 ff 00 75 08 95 40 b1 a2 c0 06 00 ff 09 14 a1 01 85 90 05 84 75 01 95 03 15 00 25 01 09 61 05 85 09 44 09 46 81 02 95 05 81 01 75 08 95 01 15 00 26 ff 00 09 65 81 02 c0 00
Position 64(Logical Maximum) and 70(Usage Maximum) are 101. Both should be 0xE7 to support JIS specific keys(ろ, Eisu, Kana, |) support. position 117 is also 101 but not related(it is Usage 65h).
There are no difference of product id between JIS and ANSI. They are same 0x0267.
Signed-off-by: Mizuho Mori morimolymoly@gmail.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-apple.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/hid/hid-apple.c b/drivers/hid/hid-apple.c index 959a9e38b4f5..149902619cbc 100644 --- a/drivers/hid/hid-apple.c +++ b/drivers/hid/hid-apple.c @@ -302,12 +302,19 @@ static int apple_event(struct hid_device *hdev, struct hid_field *field,
/* * MacBook JIS keyboard has wrong logical maximum + * Magic Keyboard JIS has wrong logical maximum */ static __u8 *apple_report_fixup(struct hid_device *hdev, __u8 *rdesc, unsigned int *rsize) { struct apple_sc *asc = hid_get_drvdata(hdev);
+ if(*rsize >=71 && rdesc[70] == 0x65 && rdesc[64] == 0x65) { + hid_info(hdev, + "fixing up Magic Keyboard JIS report descriptor\n"); + rdesc[64] = rdesc[70] = 0xe7; + } + if ((asc->quirks & APPLE_RDESC_JIS) && *rsize >= 60 && rdesc[53] == 0x65 && rdesc[59] == 0x65) { hid_info(hdev,
From: Jeremy Sowden jeremy@azazel.net
[ Upstream commit 310e2d43c3ad429c1fba4b175806cf1f55ed73a6 ]
ip6tables only sets the `IP6T_F_PROTO` flag on a rule if a protocol is specified (`-p tcp`, for example). However, if the flag is not set, `ip6_packet_match` doesn't call `ipv6_find_hdr` for the skb, in which case the fragment offset is left uninitialized and a garbage value is passed to each matcher.
Signed-off-by: Jeremy Sowden jeremy@azazel.net Reviewed-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/netfilter/ip6_tables.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 579fda1bc45d..ce54e66b47a0 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -290,6 +290,7 @@ ip6t_do_table(struct sk_buff *skb, * things we don't know, ie. tcp syn flag or ports). If the * rule is also a fragment-specific rule, non-fragments won't * match it. */ + acpar.fragoff = 0; acpar.hotdrop = false; acpar.net = state->net; acpar.in = state->in;
From: YueHaibing yuehaibing@huawei.com
[ Upstream commit a6555f844549cd190eb060daef595f94d3de1582 ]
WARNING: CPU: 1 PID: 9 at net/mac80211/sta_info.c:554 sta_info_insert_rcu+0x121/0x12a0 Modules linked in: CPU: 1 PID: 9 Comm: kworker/u8:1 Not tainted 5.14.0-rc7+ #253 Workqueue: phy3 ieee80211_iface_work RIP: 0010:sta_info_insert_rcu+0x121/0x12a0 ... Call Trace: ieee80211_ibss_finish_sta+0xbc/0x170 ieee80211_ibss_work+0x13f/0x7d0 ieee80211_iface_work+0x37a/0x500 process_one_work+0x357/0x850 worker_thread+0x41/0x4d0
If an Ad-Hoc node receives packets with invalid source MAC address, it hits a WARN_ON in sta_info_insert_check(), this can spam the log.
Signed-off-by: YueHaibing yuehaibing@huawei.com Link: https://lore.kernel.org/r/20210827144230.39944-1-yuehaibing@huawei.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/rx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index b40e71a5d795..3dc370ad23bf 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -3692,7 +3692,8 @@ static bool ieee80211_accept_frame(struct ieee80211_rx_data *rx) if (!bssid) return false; if (ether_addr_equal(sdata->vif.addr, hdr->addr2) || - ether_addr_equal(sdata->u.ibss.bssid, hdr->addr2)) + ether_addr_equal(sdata->u.ibss.bssid, hdr->addr2) || + !is_valid_ether_addr(hdr->addr2)) return false; if (ieee80211_is_beacon(hdr->frame_control)) return true;
From: Jiapeng Chong jiapeng.chong@linux.alibaba.com
[ Upstream commit dd689ed5aa905daf4ba4c99319a52aad6ea0a796 ]
Fix the following coccicheck warning:
./drivers/scsi/ses.c:137:10-16: WARNING: Unsigned expression compared with zero: result > 0.
Link: https://lore.kernel.org/r/1632477113-90378-1-git-send-email-jiapeng.chong@li... Reported-by: Abaci Robot abaci@linux.alibaba.com Signed-off-by: Jiapeng Chong jiapeng.chong@linux.alibaba.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ses.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index 69046d342bc5..39396548f9b5 100644 --- a/drivers/scsi/ses.c +++ b/drivers/scsi/ses.c @@ -120,7 +120,7 @@ static int ses_recv_diag(struct scsi_device *sdev, int page_code, static int ses_send_diag(struct scsi_device *sdev, int page_code, void *buf, int bufflen) { - u32 result; + int result;
unsigned char cmd[] = { SEND_DIAGNOSTIC,
From: Colin Ian King colin.king@canonical.com
[ Upstream commit cced4c0ec7c06f5230a2958907a409c849762293 ]
There are a couple of spelling mistakes in pr_info and pr_err messages. Fix them.
Link: https://lore.kernel.org/r/20210924230330.143785-1-colin.king@canonical.com Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/virtio_scsi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index 7ba0031d3a73..d5575869a25c 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -343,7 +343,7 @@ static void virtscsi_handle_transport_reset(struct virtio_scsi *vscsi, } break; default: - pr_info("Unsupport virtio scsi event reason %x\n", event->reason); + pr_info("Unsupported virtio scsi event reason %x\n", event->reason); } }
@@ -396,7 +396,7 @@ static void virtscsi_handle_event(struct work_struct *work) virtscsi_handle_param_change(vscsi, event); break; default: - pr_err("Unsupport virtio scsi event %x\n", event->event); + pr_err("Unsupported virtio scsi event %x\n", event->event); } virtscsi_kick_event(vscsi, event_node); }
From: Anand K Mistry amistry@google.com
[ Upstream commit 02d029a41dc986e2d5a77ecca45803857b346829 ]
perf_init_event tries multiple init callbacks and does not reset the event state between tries. When x86_pmu_event_init runs, it unconditionally sets the destroy callback to hw_perf_event_destroy. On the next init attempt after x86_pmu_event_init, in perf_try_init_event, if the pmu's capabilities includes PERF_PMU_CAP_NO_EXCLUDE, the destroy callback will be run. However, if the next init didn't set the destroy callback, hw_perf_event_destroy will be run (since the callback wasn't reset).
Looking at other pmu init functions, the common pattern is to only set the destroy callback on a successful init. Resetting the callback on failure tries to replicate that pattern.
This was discovered after commit f11dd0d80555 ("perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op") when the second (and only second) run of the perf tool after a reboot results in 0 samples being generated. The extra run of hw_perf_event_destroy results in active_events having an extra decrement on each perf run. The second run has active_events == 0 and every subsequent run has active_events < 0. When active_events == 0, the NMI handler will early-out and not record any samples.
Signed-off-by: Anand K Mistry amistry@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20210929170405.1.I078b98ee7727f9ae9d6df8262bad7e32... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index c26cca506f64..c20df6a3540c 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2075,6 +2075,7 @@ static int x86_pmu_event_init(struct perf_event *event) if (err) { if (event->destroy) event->destroy(event); + event->destroy = NULL; }
if (ACCESS_ONCE(x86_pmu.attr_rdpmc))
On Thu, 14 Oct 2021 16:53:31 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.287 release. There are 25 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 16 Oct 2021 14:51:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.287-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.9: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 32 tests: 32 pass, 0 fail
Linux version: 4.9.287-rc1-g2660ee946a02 Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 10/14/21 7:53 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.287 release. There are 25 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 16 Oct 2021 14:51:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.287-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB, using 32-bit ARM and 64-bit kernels:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 10/14/21 8:53 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.287 release. There are 25 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 16 Oct 2021 14:51:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.287-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
Hello!
On 10/14/21 9:53 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.287 release. There are 25 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 16 Oct 2021 14:51:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.287-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro's test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 4.9.287-rc1 * git: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git', 'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc'] * git branch: linux-4.9.y * git commit: 2660ee946a0246c54d930bc9fa6d2239ce8014b8 * git describe: v4.9.286-26-g2660ee946a02 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.9.y/build/v4.9.28...
## No regressions (compared to v4.9.286)
## No fixes (compared to v4.9.286)
## Test result summary total: 76652, pass: 60442, fail: 639, skip: 13314, xfail: 2257
## Build Summary * arm: 129 total, 129 passed, 0 failed * arm64: 34 total, 34 passed, 0 failed * dragonboard-410c: 1 total, 1 passed, 0 failed * hi6220-hikey: 1 total, 1 passed, 0 failed * i386: 18 total, 18 passed, 0 failed * juno-r2: 1 total, 1 passed, 0 failed * mips: 24 total, 24 passed, 0 failed * sparc: 12 total, 12 passed, 0 failed * x15: 1 total, 1 passed, 0 failed * x86: 1 total, 1 passed, 0 failed * x86_64: 18 total, 18 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-bpf * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers * kselftest-efivarfs * kselftest-filesystems * kselftest-firmware * kselftest-fpu * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kvm-unit-tests * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * packetdrill * perf * ssuite * v4l2-compliance
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On Thu, Oct 14, 2021 at 04:53:31PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.287 release. There are 25 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sat, 16 Oct 2021 14:51:59 +0000. Anything received after that time might be too late.
Build results: total: 163 pass: 163 fail: 0 Qemu test results: total: 394 pass: 394 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
linux-stable-mirror@lists.linaro.org