This is the start of the stable review cycle for the 4.19.17 release. There are 99 patches in this series, all of which will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 23 13:48:56 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.17-rc1...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.19.17-rc1
Shuah Khan shuah@kernel.org selftests: Fix test errors related to lib.mk khdr target
Ivan Mironov mironov.ivan@gmail.com drm/fb-helper: Ignore the value of fb_var_screeninfo.pixclock
Jaegeuk Kim jaegeuk@kernel.org loop: drop caches if offset or block_size are changed
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()
Jan Kara jack@suse.cz loop: Get rid of 'nested' acquisition of loop_ctl_mutex
Jan Kara jack@suse.cz loop: Avoid circular locking dependency between loop_ctl_mutex and bd_mutex
Jan Kara jack@suse.cz loop: Fix deadlock when calling blkdev_reread_part()
Jan Kara jack@suse.cz loop: Move loop_reread_partitions() out of loop_ctl_mutex
Jan Kara jack@suse.cz loop: Move special partition reread handling in loop_clr_fd()
Jan Kara jack@suse.cz loop: Push loop_ctl_mutex down to loop_change_fd()
Jan Kara jack@suse.cz loop: Push loop_ctl_mutex down to loop_set_fd()
Jan Kara jack@suse.cz loop: Push loop_ctl_mutex down to loop_set_status()
Jan Kara jack@suse.cz loop: Push loop_ctl_mutex down to loop_get_status()
Jan Kara jack@suse.cz loop: Push loop_ctl_mutex down into loop_clr_fd()
Jan Kara jack@suse.cz loop: Split setting of lo_state from loop_clr_fd
Jan Kara jack@suse.cz loop: Push lo_ctl_mutex down into individual ioctls
Jan Kara jack@suse.cz loop: Get rid of loop_index_mutex
Jan Kara jack@suse.cz loop: Fold __loop_release into loop_release
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp block/loop: Use global lock for ioctl() operation.
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp block/loop: Don't grab "struct file" for vfs_getattr() operation.
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_doit
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_name_table_dump
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_link_set
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_bearer_enable
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_link_reset_stats
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in in tipc_conn_rcv_sub
Xin Long lucien.xin@gmail.com sctp: allocate sctp_sockaddr_entry with kzalloc
Jan Kara jack@suse.cz blockdev: Fix livelocks on loop device
Stephen Smalley sds@tycho.nsa.gov selinux: fix GPF on invalid policy
Yufen Yu yuyufen@huawei.com block: use rcu_work instead of call_rcu to avoid sleep in softirq
Shakeel Butt shakeelb@google.com netfilter: ebtables: account ebt_table_info to kmemcg
J. Bruce Fields bfields@redhat.com sunrpc: handle ENOMEM in rpcb_getport_async
Hans Verkuil hverkuil@xs4all.nl media: vb2: vb2_mmap: move lock up
James Morris james.morris@microsoft.com LSM: Check for NULL cred-security on free
Eric Dumazet edumazet@google.com ipv6: make icmp6_send() robust against null skb->dev
Willem de Bruijn willemb@google.com bpf: in __bpf_redirect_no_mac pull mac only if present
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: set min width/height to a value > 0
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: fix error handling of kthread_run
Vlad Tsyrklevich vlad@tsyrklevich.net omap2fb: Fix stack memory disclosure
Florian La Roche florian.laroche@googlemail.com fix int_sqrt64() for very large numbers
YunQiang Su ysu@wavecomp.com Disable MSI also when pcie-octeon.pcie_disable on
Heinrich Schuchardt xypron.glpk@gmx.de arm64: dts: marvell: armada-ap806: reserve PSCI area
Ard Biesheuvel ard.biesheuvel@linaro.org arm64: kaslr: ensure randomized quantities are clean to the PoC
Kees Cook keescook@chromium.org pstore/ram: Avoid allocation and leak of platform data
Johan Hovold johan@kernel.org net: dsa: realtek-smi: fix OF child-node lookup
Paul Burton paul.burton@mips.com kbuild: Disable LD_DEAD_CODE_DATA_ELIMINATION with ftrace & GCC <= 4.7
Adit Ranadive aditr@vmware.com RDMA/vmw_pvrdma: Return the correct opcode when creating WR
Leon Romanovsky leon@kernel.org RDMA/nldev: Don't expose unsafe global rkey to regular user
Sakari Ailus sakari.ailus@linux.intel.com media: v4l: ioctl: Validate num_planes for debug messages
Jonathan Hunter jonathanh@nvidia.com mfd: tps6586x: Handle interrupts on suspend
Julia Lawall Julia.Lawall@lip6.fr OF: properties: add missing of_node_put
Zhenyu Wang zhenyuw@linux.intel.com drm/i915/gvt: Fix mmap range check
Hauke Mehrtens hauke@hauke-m.de MIPS: lantiq: Fix IPI interrupt handling
Rafał Miłecki rafal@milecki.pl MIPS: BCM47XX: Setup struct device for the SoC
Arnd Bergmann arnd@arndb.de mips: fix n32 compat_ipc_parse_version
Ivan Mironov mironov.ivan@gmail.com scsi: sd: Fix cache_type_store()
Stanley Chu stanley.chu@mediatek.com scsi: core: Synchronize request queue PM status only on successful resume
Kees Cook keescook@chromium.org Yama: Check for pid death before checking ancestry
Josef Bacik josef@toxicpanda.com btrfs: wait on ordered extents on abort cleanup
David Sterba dsterba@suse.com Revert "btrfs: balance dirty metadata pages in btrfs_finish_ordered_io"
Juergen Gross jgross@suse.com xen: Fix x86 sched_clock() interface for xen
Christophe Leroy christophe.leroy@c-s.fr crypto: talitos - fix ablkcipher for CONFIG_VMAP_STACK
Christophe Leroy christophe.leroy@c-s.fr crypto: talitos - reorder code in talitos_edesc_alloc()
Eric Biggers ebiggers@google.com crypto: authenc - fix parsing key with misaligned rta_len
Eric Biggers ebiggers@google.com crypto: bcm - convert to use crypto_authenc_extractkeys()
Eric Biggers ebiggers@google.com crypto: ccree - convert to use crypto_authenc_extractkeys()
Harsh Jain harsh@chelsio.com crypto: authencesn - Avoid twice completion call in decrypt path
Aymen Sghaier aymen.sghaier@nxp.com crypto: caam - fix zero-length buffer DMA mapping
Eric Biggers ebiggers@google.com crypto: sm3 - fix undefined shift by >= width of value
Heiner Kallweit hkallweit1@gmail.com r8169: load Realtek PHY driver module before r8169
Willem de Bruijn willemb@google.com ip: on queued skb use skb_header_pointer instead of pskb_may_pull
Willem de Bruijn willemb@google.com bonding: update nest level on unlink
Heiner Kallweit hkallweit1@gmail.com r8169: don't try to read counters if chip is in a PCI power-save state
Cong Wang xiyou.wangcong@gmail.com smc: move unhash as early as possible in smc_release()
Bryan Whitehead Bryan.Whitehead@microchip.com lan743x: Remove phy_read from link status change function
Stanislav Fomichev sdf@google.com tun: publish tfile after it's fully initialized
Yuchung Cheng ycheng@google.com tcp: change txhash on SYN-data timeout
Jason Gunthorpe jgg@ziepe.ca packet: Do not leak dev refcounts on error exit
JianJhen Chen kchen@synology.com net: bridge: fix a bug on using a neighbour cache entry without checking its state
Eric Dumazet edumazet@google.com ipv6: fix kernel-infoleak in ipv6_local_error()
Mark Rutland mark.rutland@arm.com arm64: Don't trap host pointer auth use to EL2
Mark Rutland mark.rutland@arm.com arm64/kvm: consistently handle host HCR_EL2 flags
Varun Prakash varun@chelsio.com scsi: target: iscsi: cxgbit: fix csk leak - 2
Varun Prakash varun@chelsio.com scsi: target: iscsi: cxgbit: fix csk leak
Sasha Levin sashal@kernel.org Revert "scsi: target: iscsi: cxgbit: fix csk leak"
Loic Poulain loic.poulain@linaro.org mmc: sdhci-msm: Disable CDR function on TX
Florian Westphal fw@strlen.de netfilter: nf_conncount: fix argument order to find_next_bit
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_conncount: speculative garbage collection on empty lists
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_conncount: move all list iterations under spinlock
Florian Westphal fw@strlen.de netfilter: nf_conncount: merge lookup and add functions
Florian Westphal fw@strlen.de netfilter: nf_conncount: restart search when nodes have been erased
Florian Westphal fw@strlen.de netfilter: nf_conncount: split gc in two phases
Florian Westphal fw@strlen.de netfilter: nf_conncount: don't skip eviction when age is negative
Shawn Bohrer sbohrer@cloudflare.com netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS
Oliver Hartkopp socketcan@hartkopp.net can: gw: ensure DLC boundaries after CAN frame modification
Dmitry Safonov dima@arista.com tty: Don't hold ldisc lock in tty_reopen() if ldisc present
Dmitry Safonov dima@arista.com tty: Simplify tty->count math in tty_reopen()
Dmitry Safonov dima@arista.com tty: Hold tty_ldisc_lock() during tty_reopen()
Dmitry Safonov dima@arista.com tty/ldsem: Wake up readers after timed out down_write()
-------------
Diffstat:
 Makefile                                           |   4 +-
 arch/arm64/boot/dts/marvell/armada-ap806.dtsi      |  17 +
 arch/arm64/include/asm/kvm_arm.h                   |   3 +
 arch/arm64/kernel/head.S                           |   5 +-
 arch/arm64/kernel/kaslr.c                          |   8 +-
 arch/arm64/kvm/hyp/switch.c                        |   2 +-
 arch/mips/Kconfig                                  |   1 +
 arch/mips/bcm47xx/setup.c                          |  31 ++
 arch/mips/lantiq/irq.c                             |  68 +---
 arch/mips/pci/msi-octeon.c                         |   4 +-
 arch/x86/xen/time.c                                |  12 +-
 block/partition-generic.c                          |   8 +-
 crypto/authenc.c                                   |  14 +-
 crypto/authencesn.c                                |   2 +-
 crypto/sm3_generic.c                               |   2 +-
 drivers/block/loop.c                               | 443 +++++++++++++--------
 drivers/block/loop.h                               |   1 -
 drivers/crypto/Kconfig                             |   1 +
 drivers/crypto/bcm/cipher.c                        |  44 +-
 drivers/crypto/caam/caamhash.c                     |  15 +-
 drivers/crypto/ccree/cc_aead.c                     |  40 +-
 drivers/crypto/talitos.c                           |  26 +-
 drivers/gpu/drm/drm_fb_helper.c                    |   7 +-
 drivers/gpu/drm/i915/gvt/kvmgt.c                   |  14 +-
 drivers/infiniband/core/nldev.c                    |   4 -
 drivers/infiniband/hw/vmw_pvrdma/pvrdma.h          |  35 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c       |   6 +
 drivers/media/common/videobuf2/videobuf2-core.c    |  11 +-
 drivers/media/platform/vivid/vivid-kthread-cap.c   |   5 +-
 drivers/media/platform/vivid/vivid-kthread-out.c   |   5 +-
 drivers/media/platform/vivid/vivid-vid-common.c    |   2 +-
 drivers/media/v4l2-core/v4l2-ioctl.c               |   4 +-
 drivers/mfd/tps6586x.c                             |  24 ++
 drivers/mmc/host/sdhci-msm.c                       |  43 +-
 drivers/net/bonding/bond_main.c                    |   3 +
 drivers/net/dsa/realtek-smi.c                      |  18 +-
 drivers/net/ethernet/microchip/lan743x_main.c      |  11 +-
 drivers/net/ethernet/realtek/r8169.c               |   7 +-
 drivers/net/tun.c                                  |  11 +-
 drivers/of/property.c                              |   1 +
 drivers/scsi/scsi_pm.c                             |  26 +-
 drivers/scsi/sd.c                                  |   6 +
 drivers/target/iscsi/cxgbit/cxgbit_cm.c            |  23 +-
 drivers/tty/tty_io.c                               |  22 +-
 drivers/tty/tty_ldsem.c                            |  10 +
 drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c    |   2 +
 drivers/xen/events/events_base.c                   |   2 +-
 fs/block_dev.c                                     |  28 +-
 fs/btrfs/disk-io.c                                 |   8 +
 fs/btrfs/inode.c                                   |   3 -
 fs/pstore/ram.c                                    |   9 +-
 include/linux/bcma/bcma_soc.h                      |   1 +
 include/linux/genhd.h                              |   2 +-
 include/net/netfilter/nf_conntrack_count.h         |  19 +-
 include/uapi/rdma/vmw_pvrdma-abi.h                 |   1 +
 init/Kconfig                                       |   1 +
 lib/int_sqrt.c                                     |   2 +-
 net/bridge/br_netfilter_hooks.c                    |   2 +-
 net/bridge/netfilter/ebtables.c                    |   6 +-
 net/can/gw.c                                       |  30 +-
 net/core/filter.c                                  |  21 +-
 net/core/lwt_bpf.c                                 |   1 +
 net/ipv4/ip_sockglue.c                             |  12 +-
 net/ipv4/tcp_timer.c                               |   2 +-
 net/ipv6/datagram.c                                |  11 +-
 net/ipv6/icmp.c                                    |   8 +-
 net/netfilter/nf_conncount.c                       | 290 ++++++--------
 net/netfilter/nft_connlimit.c                      |  14 +-
 net/packet/af_packet.c                             |   4 +-
 net/sctp/ipv6.c                                    |   5 +-
 net/sctp/protocol.c                                |   4 +-
 net/smc/af_smc.c                                   |   4 +-
 net/sunrpc/rpcb_clnt.c                             |   8 +
 net/tipc/netlink_compat.c                          |  50 ++-
 net/tipc/topsrv.c                                  |   2 +-
 security/security.c                                |   7 +
 security/selinux/ss/policydb.c                     |   3 +-
 security/yama/yama_lsm.c                           |   4 +-
 tools/testing/selftests/android/Makefile           |   2 +-
 tools/testing/selftests/futex/functional/Makefile  |   1 +
 tools/testing/selftests/gpio/Makefile              |   1 +
 tools/testing/selftests/kvm/Makefile               |   2 +-
 tools/testing/selftests/lib.mk                     |   8 +-
 .../selftests/networking/timestamping/Makefile     |   1 +
 tools/testing/selftests/vm/Makefile                |   1 +
 85 files changed, 977 insertions(+), 654 deletions(-)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit 231f8fd0cca078bd4396dd7e380db813ac5736e2 upstream.
ldsem_down_read() will sleep if there is a pending writer in the queue. If the writer times out, readers in the queue should be woken up, otherwise they may miss a chance to acquire the semaphore until the last active reader does ldsem_up_read().
There were a couple of reports where there was one active reader and other readers soft locked up:

 Showing all locks held in the system:
 2 locks held by khungtaskd/17:
  #0:  (rcu_read_lock){......}, at: watchdog+0x124/0x6d1
  #1:  (tasklist_lock){.+.+..}, at: debug_show_all_locks+0x72/0x2d3
 2 locks held by askfirst/123:
  #0:  (&tty->ldisc_sem){.+.+.+}, at: ldsem_down_read+0x46/0x58
  #1:  (&ldata->atomic_read_lock){+.+...}, at: n_tty_read+0x115/0xbe4
Prevent readers from waiting for active readers to release the ldisc semaphore.
Link: lkml.kernel.org/r/20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com Link: lkml.kernel.org/r/20180907045041.GF1110@shao2-debian Cc: Jiri Slaby jslaby@suse.com Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Reported-by: kernel test robot rong.a.chen@intel.com Signed-off-by: Dmitry Safonov dima@arista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_ldsem.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/tty/tty_ldsem.c +++ b/drivers/tty/tty_ldsem.c @@ -293,6 +293,16 @@ down_write_failed(struct ld_semaphore *s if (!locked) atomic_long_add_return(-LDSEM_WAIT_BIAS, &sem->count); list_del(&waiter.list); + + /* + * In case of timeout, wake up every reader who gave the right of way + * to writer. Prevent separation readers into two groups: + * one that helds semaphore and another that sleeps. + * (in case of no contention with a writer) + */ + if (!locked && list_empty(&sem->write_wait)) + __ldsem_wake_readers(sem); + raw_spin_unlock_irq(&sem->wait_lock);
__set_current_state(TASK_RUNNING);
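The condition added above only fires when the timed-out writer failed to take the lock and no other writer is still queued. A minimal userspace sketch of that decision, illustrative only; the function name is made up and this is not the kernel ldsem code:

	#include <stdbool.h>
	#include <stdio.h>

	/* Model of the check: after a writer times out in down_write(),
	 * readers parked behind it must be woken when no other writer is
	 * queued, otherwise they sleep until the last active reader calls
	 * ldsem_up_read(). */
	static bool wake_readers_after_write_timeout(bool write_locked,
						     bool writers_still_queued)
	{
		return !write_locked && !writers_still_queued;
	}

	int main(void)
	{
		/* writer timed out, no other writer queued -> wake readers */
		printf("wake: %d\n", wake_readers_after_write_timeout(false, false));
		/* another writer still waiting -> keep readers parked */
		printf("wake: %d\n", wake_readers_after_write_timeout(false, true));
		return 0;
	}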
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit 83d817f41070c48bc3eb7ec18e43000a548fca5c upstream.
tty_ldisc_reinit() doesn't race with tty_ldisc_hangup(), set_ldisc() or tty_ldisc_release(), as they all take the tty lock. But it races with anyone who expects the line discipline to be the same after holding the read semaphore in tty_ldisc_ref().
We've seen the following crash on v4.9.108 stable:
BUG: unable to handle kernel paging request at 0000000000002260
IP: [..] n_tty_receive_buf_common+0x5f/0x86d
Workqueue: events_unbound flush_to_ldisc
Call Trace:
 [..] n_tty_receive_buf2
 [..] tty_ldisc_receive_buf
 [..] flush_to_ldisc
 [..] process_one_work
 [..] worker_thread
 [..] kthread
 [..] ret_from_fork
tty_ldisc_reinit() should be called with ldisc_sem held for writing, which will protect any reader against line discipline changes.
Cc: Jiri Slaby jslaby@suse.com Cc: stable@vger.kernel.org # b027e2298bd5 ("tty: fix data race between tty_init_dev and flush of buf") Reviewed-by: Jiri Slaby jslaby@suse.cz Reported-by: syzbot+3aa9784721dfb90e984d@syzkaller.appspotmail.com Tested-by: Mark Rutland mark.rutland@arm.com Tested-by: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Signed-off-by: Dmitry Safonov dima@arista.com Tested-by: Tycho Andersen tycho@tycho.ws Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_io.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1267,15 +1267,20 @@ static int tty_reopen(struct tty_struct if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN)) return -EBUSY;
- tty->count++; + retval = tty_ldisc_lock(tty, 5 * HZ); + if (retval) + return retval;
+ tty->count++; if (tty->ldisc) - return 0; + goto out_unlock;
retval = tty_ldisc_reinit(tty, tty->termios.c_line); if (retval) tty->count--;
+out_unlock: + tty_ldisc_unlock(tty); return retval; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit cf62a1a13749db0d32b5cdd800ea91a4087319de upstream.
As noted by Jiri, tty_ldisc_reinit() shouldn't rely on the tty counter. Simplify the math by increasing the counter only after a successful reinit.
Cc: Jiri Slaby jslaby@suse.com Link: lkml.kernel.org/r/20180829022353.23568-2-dima@arista.com Suggested-by: Jiri Slaby jslaby@suse.com Reviewed-by: Jiri Slaby jslaby@suse.cz Tested-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Dmitry Safonov dima@arista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_io.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
--- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1271,16 +1271,13 @@ static int tty_reopen(struct tty_struct if (retval) return retval;
- tty->count++; - if (tty->ldisc) - goto out_unlock; + if (!tty->ldisc) + retval = tty_ldisc_reinit(tty, tty->termios.c_line); + tty_ldisc_unlock(tty);
- retval = tty_ldisc_reinit(tty, tty->termios.c_line); - if (retval) - tty->count--; + if (retval == 0) + tty->count++;
-out_unlock: - tty_ldisc_unlock(tty); return retval; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit d3736d82e8169768218ee0ef68718875918091a0 upstream.
Try to get a reference to the ldisc during tty_reopen(). If an ldisc is present, we don't need to call tty_ldisc_reinit() or take the write side of the line discipline semaphore. Effectively, this optimizes the fast path of tty_reopen(), but more importantly it won't interrupt ongoing IO on the tty, as no ldisc change is needed. This fixes a user-visible issue where tty_reopen() interrupted the login process for a user with a long password, observed and reported by Lukas.
Fixes: c96cf923a98d ("tty: Don't block on IO when ldisc change is pending") Fixes: 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()") Cc: Jiri Slaby jslaby@suse.com Reported-by: Lukas F. Hartmann lukas@mntmn.com Tested-by: Lukas F. Hartmann lukas@mntmn.com Cc: stable stable@vger.kernel.org Signed-off-by: Dmitry Safonov dima@arista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_io.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
--- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1255,7 +1255,8 @@ static void tty_driver_remove_tty(struct static int tty_reopen(struct tty_struct *tty) { struct tty_driver *driver = tty->driver; - int retval; + struct tty_ldisc *ld; + int retval = 0;
if (driver->type == TTY_DRIVER_TYPE_PTY && driver->subtype == PTY_TYPE_MASTER) @@ -1267,13 +1268,18 @@ static int tty_reopen(struct tty_struct if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN)) return -EBUSY;
- retval = tty_ldisc_lock(tty, 5 * HZ); - if (retval) - return retval; - - if (!tty->ldisc) - retval = tty_ldisc_reinit(tty, tty->termios.c_line); - tty_ldisc_unlock(tty); + ld = tty_ldisc_ref_wait(tty); + if (ld) { + tty_ldisc_deref(ld); + } else { + retval = tty_ldisc_lock(tty, 5 * HZ); + if (retval) + return retval; + + if (!tty->ldisc) + retval = tty_ldisc_reinit(tty, tty->termios.c_line); + tty_ldisc_unlock(tty); + }
if (retval == 0) tty->count++;
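Taken together with the two preceding tty patches, the resulting tty_reopen() flow is roughly as follows (condensed from the diff above for readability; not a standalone compilable unit):

	struct tty_ldisc *ld;
	int retval = 0;

	/* fast path: an ldisc is already installed, just take and drop
	 * a reference -- no reinit, no write side of ldisc_sem */
	ld = tty_ldisc_ref_wait(tty);
	if (ld) {
		tty_ldisc_deref(ld);
	} else {
		/* slow path: serialize against other ldisc changers */
		retval = tty_ldisc_lock(tty, 5 * HZ);
		if (retval)
			return retval;

		if (!tty->ldisc)
			retval = tty_ldisc_reinit(tty, tty->termios.c_line);
		tty_ldisc_unlock(tty);
	}

	if (retval == 0)
		tty->count++;	/* count is bumped only after success */

	return retval;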
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver Hartkopp socketcan@hartkopp.net
commit 0aaa81377c5a01f686bcdb8c7a6929a7bf330c68 upstream.
Muyu Yu provided a POC where a root user with CAP_NET_ADMIN can create a CAN frame modification rule that sets the data length code to a higher value than the available CAN frame data size. In combination with a configured checksum calculation whose result is stored relative to the end of the data (e.g. cgw_csum_xor_rel), the tail of the skb (e.g. the frag_list pointer in skb_shared_info) can be rewritten, which can finally cause a system crash.
Michal Kubecek suggested dropping frames whose DLC exceeds the available space after the modification process, and provided a patch that can handle CAN FD frames too. Within this patch we also limit the length for the checksum calculations to the maximum Classic CAN data length (8).
CAN frames that are dropped by these additional checks are counted with the CGW_DELETED counter, which indicates misconfigurations in can-gw rules.
This fixes CVE-2019-3701.
Reported-by: Muyu Yu ieatmuttonchuan@gmail.com Reported-by: Marcus Meissner meissner@suse.de Suggested-by: Michal Kubecek mkubecek@suse.cz Tested-by: Muyu Yu ieatmuttonchuan@gmail.com Tested-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Cc: linux-stable stable@vger.kernel.org # >= v3.2 Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/can/gw.c | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-)
--- a/net/can/gw.c +++ b/net/can/gw.c @@ -416,13 +416,29 @@ static void can_can_gw_rcv(struct sk_buf while (modidx < MAX_MODFUNCTIONS && gwj->mod.modfunc[modidx]) (*gwj->mod.modfunc[modidx++])(cf, &gwj->mod);
- /* check for checksum updates when the CAN frame has been modified */ + /* Has the CAN frame been modified? */ if (modidx) { - if (gwj->mod.csumfunc.crc8) + /* get available space for the processed CAN frame type */ + int max_len = nskb->len - offsetof(struct can_frame, data); + + /* dlc may have changed, make sure it fits to the CAN frame */ + if (cf->can_dlc > max_len) + goto out_delete; + + /* check for checksum updates in classic CAN length only */ + if (gwj->mod.csumfunc.crc8) { + if (cf->can_dlc > 8) + goto out_delete; + (*gwj->mod.csumfunc.crc8)(cf, &gwj->mod.csum.crc8); + } + + if (gwj->mod.csumfunc.xor) { + if (cf->can_dlc > 8) + goto out_delete;
- if (gwj->mod.csumfunc.xor) (*gwj->mod.csumfunc.xor)(cf, &gwj->mod.csum.xor); + } }
/* clear the skb timestamp if not configured the other way */ @@ -434,6 +450,14 @@ static void can_can_gw_rcv(struct sk_buf gwj->dropped_frames++; else gwj->handled_frames++; + + return; + + out_delete: + /* delete frame due to misconfiguration */ + gwj->deleted_frames++; + kfree_skb(nskb); + return; }
static inline int cgw_register_filter(struct net *net, struct cgw_job *gwj)
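As a rough userspace illustration of the boundary being enforced (this is not the kernel code path, which derives the limit from nskb->len; it only shows the struct can_frame arithmetic from <linux/can.h>):

	#include <linux/can.h>
	#include <stddef.h>
	#include <stdio.h>

	int main(void)
	{
		/* For a Classic CAN frame the payload area behind the header is
		 * offsetof(struct can_frame, data) .. sizeof(struct can_frame),
		 * i.e. at most CAN_MAX_DLEN (8) bytes. */
		size_t max_len = sizeof(struct can_frame) -
				 offsetof(struct can_frame, data);
		struct can_frame cf = { 0 };

		cf.can_dlc = 12;	/* bogus DLC set by a modification rule */

		printf("available payload bytes: %zu\n", max_len);
		if (cf.can_dlc > max_len)
			printf("frame would be dropped (CGW_DELETED)\n");
		return 0;
	}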
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shawn Bohrer sbohrer@cloudflare.com
commit c78e7818f16f687389174c4569243abbec8dc68f upstream.
Most of the time CONNCOUNT_LOCK_SLOTS and CONNCOUNT_SLOTS were the same value anyway, but when CONFIG_LOCKDEP was enabled we would use a smaller number of locks to reduce overhead. Unfortunately, having two values is confusing and not worth the complexity.
This fixes a bug where tree_gc_worker() would only GC up to CONNCOUNT_LOCK_SLOTS trees, which meant that when CONFIG_LOCKDEP was enabled not all trees would be GCed by tree_gc_worker().
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conncount.c | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-)
--- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -33,12 +33,6 @@
#define CONNCOUNT_SLOTS 256U
-#ifdef CONFIG_LOCKDEP -#define CONNCOUNT_LOCK_SLOTS 8U -#else -#define CONNCOUNT_LOCK_SLOTS 256U -#endif - #define CONNCOUNT_GC_MAX_NODES 8 #define MAX_KEYLEN 5
@@ -60,7 +54,7 @@ struct nf_conncount_rb { struct rcu_head rcu_head; };
-static spinlock_t nf_conncount_locks[CONNCOUNT_LOCK_SLOTS] __cacheline_aligned_in_smp; +static spinlock_t nf_conncount_locks[CONNCOUNT_SLOTS] __cacheline_aligned_in_smp;
struct nf_conncount_data { unsigned int keylen; @@ -353,7 +347,7 @@ insert_tree(struct net *net, unsigned int count = 0, gc_count = 0; bool node_found = false;
- spin_lock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]); + spin_lock_bh(&nf_conncount_locks[hash]);
parent = NULL; rbnode = &(root->rb_node); @@ -430,7 +424,7 @@ insert_tree(struct net *net, rb_link_node_rcu(&rbconn->node, parent, rbnode); rb_insert_color(&rbconn->node, root); out_unlock: - spin_unlock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]); + spin_unlock_bh(&nf_conncount_locks[hash]); return count; }
@@ -499,7 +493,7 @@ static void tree_gc_worker(struct work_s struct rb_node *node; unsigned int tree, next_tree, gc_count = 0;
- tree = data->gc_tree % CONNCOUNT_LOCK_SLOTS; + tree = data->gc_tree % CONNCOUNT_SLOTS; root = &data->root[tree];
rcu_read_lock(); @@ -621,10 +615,7 @@ static int __init nf_conncount_modinit(v { int i;
- BUILD_BUG_ON(CONNCOUNT_LOCK_SLOTS > CONNCOUNT_SLOTS); - BUILD_BUG_ON((CONNCOUNT_SLOTS % CONNCOUNT_LOCK_SLOTS) != 0); - - for (i = 0; i < CONNCOUNT_LOCK_SLOTS; ++i) + for (i = 0; i < CONNCOUNT_SLOTS; ++i) spin_lock_init(&nf_conncount_locks[i]);
conncount_conn_cachep = kmem_cache_create("nf_conncount_tuple",
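A quick userspace sketch of the tree_gc_worker() coverage bug described above (the constants mirror the patch; everything else is illustrative):

	#include <stdio.h>

	#define CONNCOUNT_SLOTS      256U
	#define CONNCOUNT_LOCK_SLOTS   8U	/* old value under CONFIG_LOCKDEP */

	int main(void)
	{
		char seen_old[CONNCOUNT_SLOTS] = { 0 };
		char seen_new[CONNCOUNT_SLOTS] = { 0 };
		unsigned int gc_tree, old = 0, new = 0;

		for (gc_tree = 0; gc_tree < CONNCOUNT_SLOTS; gc_tree++) {
			seen_old[gc_tree % CONNCOUNT_LOCK_SLOTS] = 1;	/* old mapping */
			seen_new[gc_tree % CONNCOUNT_SLOTS] = 1;	/* new mapping */
		}
		for (gc_tree = 0; gc_tree < CONNCOUNT_SLOTS; gc_tree++) {
			old += seen_old[gc_tree];
			new += seen_new[gc_tree];
		}
		/* with lockdep enabled only 8 of 256 trees were ever GCed */
		printf("trees reachable by gc worker: old=%u new=%u\n", old, new);
		return 0;
	}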
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit 4cd273bb91b3001f623f516ec726c49754571b1a upstream.
age is a signed integer, so the result can be negative when the timestamps have a large delta. In this case we want to discard the entry.
Instead of using age >= 2 || age < 0, just make it unsigned.
Fixes: b36e4523d4d56 ("netfilter: nf_conncount: fix garbage collection confirm race") Reviewed-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conncount.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -155,7 +155,7 @@ find_or_evict(struct net *net, struct nf const struct nf_conntrack_tuple_hash *found; unsigned long a, b; int cpu = raw_smp_processor_id(); - __s32 age; + u32 age;
found = nf_conntrack_find_get(net, &conn->zone, &conn->tuple); if (found)
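A small userspace demonstration of why the unsigned type suffices here (it mirrors the jiffies delta computed in find_or_evict(); illustrative only):

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint32_t a = 1000;	/* current jiffies, truncated to 32 bit */
		uint32_t b = 1005;	/* conn->jiffies32 stamped "in the future" */

		/* signed view: -5, needed the extra "age < 0" test */
		int32_t age_signed = (int32_t)(a - b);
		/* unsigned view: 4294967291, already satisfies "age >= 2" */
		uint32_t age_unsigned = a - b;

		printf("signed age: %d, unsigned age: %u\n",
		       age_signed, age_unsigned);
		return 0;
	}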
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit f7fcc98dfc2d136722007fec0debbed761679b94 upstream.
The lockless workqueue garbage collector can race with the packet-path garbage collector to delete list nodes, as it calls tree_nodes_free() with the addresses of nodes that might already have been freed on another CPU.
To fix this, split gc into two phases.
One phase performs gc on the connections: from a locking perspective, this is the same as count_tree(): we hold the rcu lock, but we do not change the tree, we only change the nodes' contents.
The second phase acquires the tree lock and reaps empty nodes. This avoids a race condition between the garbage collection and the packet path: if a node has already been freed, the second phase won't find it anymore.
This second phase is, from a locking perspective, the same as insert_tree().
The former only modifies nodes (list content, count); the latter modifies the tree itself (rb_erase or rb_insert).
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search") Reviewed-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conncount.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-)
--- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -500,16 +500,32 @@ static void tree_gc_worker(struct work_s for (node = rb_first(root); node != NULL; node = rb_next(node)) { rbconn = rb_entry(node, struct nf_conncount_rb, node); if (nf_conncount_gc_list(data->net, &rbconn->list)) - gc_nodes[gc_count++] = rbconn; + gc_count++; } rcu_read_unlock();
spin_lock_bh(&nf_conncount_locks[tree]); + if (gc_count < ARRAY_SIZE(gc_nodes)) + goto next; /* do not bother */
- if (gc_count) { - tree_nodes_free(root, gc_nodes, gc_count); + gc_count = 0; + node = rb_first(root); + while (node != NULL) { + rbconn = rb_entry(node, struct nf_conncount_rb, node); + node = rb_next(node); + + if (rbconn->list.count > 0) + continue; + + gc_nodes[gc_count++] = rbconn; + if (gc_count >= ARRAY_SIZE(gc_nodes)) { + tree_nodes_free(root, gc_nodes, gc_count); + gc_count = 0; + } }
+ tree_nodes_free(root, gc_nodes, gc_count); +next: clear_bit(tree, data->pending_trees);
next_tree = (tree + 1) % CONNCOUNT_SLOTS;
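Reflowed from the hunk above for readability, the gist of the reworked tree_gc_worker() is the following (comments added; not a standalone compilable excerpt):

	/* phase 1: under RCU only, trim the per-node lists and count how
	 * many tree nodes ended up empty -- the tree itself is untouched */
	rcu_read_lock();
	for (node = rb_first(root); node != NULL; node = rb_next(node)) {
		rbconn = rb_entry(node, struct nf_conncount_rb, node);
		if (nf_conncount_gc_list(data->net, &rbconn->list))
			gc_count++;
	}
	rcu_read_unlock();

	/* phase 2: only if enough nodes look reclaimable, take the tree
	 * lock, re-check each node and erase the ones that are still empty */
	spin_lock_bh(&nf_conncount_locks[tree]);
	if (gc_count < ARRAY_SIZE(gc_nodes))
		goto next;	/* do not bother */

	gc_count = 0;
	node = rb_first(root);
	while (node != NULL) {
		rbconn = rb_entry(node, struct nf_conncount_rb, node);
		node = rb_next(node);

		if (rbconn->list.count > 0)
			continue;

		gc_nodes[gc_count++] = rbconn;
		if (gc_count >= ARRAY_SIZE(gc_nodes)) {
			tree_nodes_free(root, gc_nodes, gc_count);
			gc_count = 0;
		}
	}
	tree_nodes_free(root, gc_nodes, gc_count);
 next:
	clear_bit(tree, data->pending_trees);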
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit e8cfb372b38a1b8979aa7f7631fb5e7b11c3793c upstream.
Shawn Bohrer reported the following crash:

 |RIP: 0010:rb_erase+0xae/0x360
 [..]
 Call Trace:
  nf_conncount_destroy+0x59/0xc0 [nf_conncount]
  cleanup_match+0x45/0x70 [ip_tables]
  ...
Shawn tracked this down to a bogus 'parent' pointer: the problem is that when we insert a new node, there is a chance that the 'parent' we found was also passed to tree_nodes_free() (because that node was empty) for erase+free.
Instead of trying to be clever and detect when this happens, restart the search if we have evicted one or more nodes. To prevent frequent restarts, do not perform gc on the second round.
Also, unconditionally schedule the gc worker. The condition
  gc_count > ARRAY_SIZE(gc_nodes)
cannot be true unless the tree grows very large, as the height of the tree will be low even with hundreds of nodes present.
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search") Reported-by: Shawn Bohrer sbohrer@cloudflare.com Reviewed-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conncount.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-)
--- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -346,9 +346,10 @@ insert_tree(struct net *net, struct nf_conncount_tuple *conn; unsigned int count = 0, gc_count = 0; bool node_found = false; + bool do_gc = true;
spin_lock_bh(&nf_conncount_locks[hash]); - +restart: parent = NULL; rbnode = &(root->rb_node); while (*rbnode) { @@ -381,21 +382,16 @@ insert_tree(struct net *net, if (gc_count >= ARRAY_SIZE(gc_nodes)) continue;
- if (nf_conncount_gc_list(net, &rbconn->list)) + if (do_gc && nf_conncount_gc_list(net, &rbconn->list)) gc_nodes[gc_count++] = rbconn; }
if (gc_count) { tree_nodes_free(root, gc_nodes, gc_count); - /* tree_node_free before new allocation permits - * allocator to re-use newly free'd object. - * - * This is a rare event; in most cases we will find - * existing node to re-use. (or gc_count is 0). - */ - - if (gc_count >= ARRAY_SIZE(gc_nodes)) - schedule_gc_worker(data, hash); + schedule_gc_worker(data, hash); + gc_count = 0; + do_gc = false; + goto restart; }
if (node_found)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit df4a902509766897f7371fdfa4c3bf8bc321b55d upstream.
'lookup' is always followed by 'add'. Merge both and make the list-walk part of nf_conncount_add().
This also avoids one unneeded unlock/re-lock pair.
Extra care needs to be taken in count_tree, as we only hold the rcu read lock, i.e. we can only insert into an existing tree node after acquiring its lock and making sure it has a nonzero count.
As a zero count should be rare, just fall back to insert_tree() (which acquires the tree lock).
This issue and its solution were pointed out by Shawn Bohrer during patch review.
Reviewed-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/net/netfilter/nf_conntrack_count.h | 18 --- net/netfilter/nf_conncount.c | 146 +++++++++++++---------------- net/netfilter/nft_connlimit.c | 14 -- 3 files changed, 72 insertions(+), 106 deletions(-)
--- a/include/net/netfilter/nf_conntrack_count.h +++ b/include/net/netfilter/nf_conntrack_count.h @@ -5,12 +5,6 @@
struct nf_conncount_data;
-enum nf_conncount_list_add { - NF_CONNCOUNT_ADDED, /* list add was ok */ - NF_CONNCOUNT_ERR, /* -ENOMEM, must drop skb */ - NF_CONNCOUNT_SKIP, /* list is already reclaimed by gc */ -}; - struct nf_conncount_list { spinlock_t list_lock; struct list_head head; /* connections with the same filtering key */ @@ -29,18 +23,12 @@ unsigned int nf_conncount_count(struct n const struct nf_conntrack_tuple *tuple, const struct nf_conntrack_zone *zone);
-void nf_conncount_lookup(struct net *net, struct nf_conncount_list *list, - const struct nf_conntrack_tuple *tuple, - const struct nf_conntrack_zone *zone, - bool *addit); +int nf_conncount_add(struct net *net, struct nf_conncount_list *list, + const struct nf_conntrack_tuple *tuple, + const struct nf_conntrack_zone *zone);
void nf_conncount_list_init(struct nf_conncount_list *list);
-enum nf_conncount_list_add -nf_conncount_add(struct nf_conncount_list *list, - const struct nf_conntrack_tuple *tuple, - const struct nf_conntrack_zone *zone); - bool nf_conncount_gc_list(struct net *net, struct nf_conncount_list *list);
--- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -83,38 +83,6 @@ static int key_diff(const u32 *a, const return memcmp(a, b, klen * sizeof(u32)); }
-enum nf_conncount_list_add -nf_conncount_add(struct nf_conncount_list *list, - const struct nf_conntrack_tuple *tuple, - const struct nf_conntrack_zone *zone) -{ - struct nf_conncount_tuple *conn; - - if (WARN_ON_ONCE(list->count > INT_MAX)) - return NF_CONNCOUNT_ERR; - - conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC); - if (conn == NULL) - return NF_CONNCOUNT_ERR; - - conn->tuple = *tuple; - conn->zone = *zone; - conn->cpu = raw_smp_processor_id(); - conn->jiffies32 = (u32)jiffies; - conn->dead = false; - spin_lock_bh(&list->list_lock); - if (list->dead == true) { - kmem_cache_free(conncount_conn_cachep, conn); - spin_unlock_bh(&list->list_lock); - return NF_CONNCOUNT_SKIP; - } - list_add_tail(&conn->node, &list->head); - list->count++; - spin_unlock_bh(&list->list_lock); - return NF_CONNCOUNT_ADDED; -} -EXPORT_SYMBOL_GPL(nf_conncount_add); - static void __conn_free(struct rcu_head *h) { struct nf_conncount_tuple *conn; @@ -177,11 +145,10 @@ find_or_evict(struct net *net, struct nf return ERR_PTR(-EAGAIN); }
-void nf_conncount_lookup(struct net *net, - struct nf_conncount_list *list, - const struct nf_conntrack_tuple *tuple, - const struct nf_conntrack_zone *zone, - bool *addit) +static int __nf_conncount_add(struct net *net, + struct nf_conncount_list *list, + const struct nf_conntrack_tuple *tuple, + const struct nf_conntrack_zone *zone) { const struct nf_conntrack_tuple_hash *found; struct nf_conncount_tuple *conn, *conn_n; @@ -189,9 +156,6 @@ void nf_conncount_lookup(struct net *net unsigned int collect = 0; bool free_entry = false;
- /* best effort only */ - *addit = tuple ? true : false; - /* check the saved connections */ list_for_each_entry_safe(conn, conn_n, &list->head, node) { if (collect > CONNCOUNT_GC_MAX_NODES) @@ -201,21 +165,19 @@ void nf_conncount_lookup(struct net *net if (IS_ERR(found)) { /* Not found, but might be about to be confirmed */ if (PTR_ERR(found) == -EAGAIN) { - if (!tuple) - continue; - if (nf_ct_tuple_equal(&conn->tuple, tuple) && nf_ct_zone_id(&conn->zone, conn->zone.dir) == nf_ct_zone_id(zone, zone->dir)) - *addit = false; - } else if (PTR_ERR(found) == -ENOENT) + return 0; /* already exists */ + } else { collect++; + } continue; }
found_ct = nf_ct_tuplehash_to_ctrack(found);
- if (tuple && nf_ct_tuple_equal(&conn->tuple, tuple) && + if (nf_ct_tuple_equal(&conn->tuple, tuple) && nf_ct_zone_equal(found_ct, zone, zone->dir)) { /* * We should not see tuples twice unless someone hooks @@ -223,7 +185,8 @@ void nf_conncount_lookup(struct net *net * * Attempt to avoid a re-add in this case. */ - *addit = false; + nf_ct_put(found_ct); + return 0; } else if (already_closed(found_ct)) { /* * we do not care about connections which are @@ -237,8 +200,38 @@ void nf_conncount_lookup(struct net *net
nf_ct_put(found_ct); } + + if (WARN_ON_ONCE(list->count > INT_MAX)) + return -EOVERFLOW; + + conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC); + if (conn == NULL) + return -ENOMEM; + + conn->tuple = *tuple; + conn->zone = *zone; + conn->cpu = raw_smp_processor_id(); + conn->jiffies32 = (u32)jiffies; + list_add_tail(&conn->node, &list->head); + list->count++; + return 0; +} + +int nf_conncount_add(struct net *net, + struct nf_conncount_list *list, + const struct nf_conntrack_tuple *tuple, + const struct nf_conntrack_zone *zone) +{ + int ret; + + /* check the saved connections */ + spin_lock_bh(&list->list_lock); + ret = __nf_conncount_add(net, list, tuple, zone); + spin_unlock_bh(&list->list_lock); + + return ret; } -EXPORT_SYMBOL_GPL(nf_conncount_lookup); +EXPORT_SYMBOL_GPL(nf_conncount_add);
void nf_conncount_list_init(struct nf_conncount_list *list) { @@ -339,13 +332,11 @@ insert_tree(struct net *net, const struct nf_conntrack_tuple *tuple, const struct nf_conntrack_zone *zone) { - enum nf_conncount_list_add ret; struct nf_conncount_rb *gc_nodes[CONNCOUNT_GC_MAX_NODES]; struct rb_node **rbnode, *parent; struct nf_conncount_rb *rbconn; struct nf_conncount_tuple *conn; unsigned int count = 0, gc_count = 0; - bool node_found = false; bool do_gc = true;
spin_lock_bh(&nf_conncount_locks[hash]); @@ -363,20 +354,15 @@ restart: } else if (diff > 0) { rbnode = &((*rbnode)->rb_right); } else { - /* unlikely: other cpu added node already */ - node_found = true; - ret = nf_conncount_add(&rbconn->list, tuple, zone); - if (ret == NF_CONNCOUNT_ERR) { + int ret; + + ret = nf_conncount_add(net, &rbconn->list, tuple, zone); + if (ret) count = 0; /* hotdrop */ - } else if (ret == NF_CONNCOUNT_ADDED) { + else count = rbconn->list.count; - } else { - /* NF_CONNCOUNT_SKIP, rbconn is already - * reclaimed by gc, insert a new tree node - */ - node_found = false; - } - break; + tree_nodes_free(root, gc_nodes, gc_count); + goto out_unlock; }
if (gc_count >= ARRAY_SIZE(gc_nodes)) @@ -394,9 +380,6 @@ restart: goto restart; }
- if (node_found) - goto out_unlock; - /* expected case: match, insert new node */ rbconn = kmem_cache_alloc(conncount_rb_cachep, GFP_ATOMIC); if (rbconn == NULL) @@ -431,7 +414,6 @@ count_tree(struct net *net, const struct nf_conntrack_tuple *tuple, const struct nf_conntrack_zone *zone) { - enum nf_conncount_list_add ret; struct rb_root *root; struct rb_node *parent; struct nf_conncount_rb *rbconn; @@ -444,7 +426,6 @@ count_tree(struct net *net, parent = rcu_dereference_raw(root->rb_node); while (parent) { int diff; - bool addit;
rbconn = rb_entry(parent, struct nf_conncount_rb, node);
@@ -454,24 +435,29 @@ count_tree(struct net *net, } else if (diff > 0) { parent = rcu_dereference_raw(parent->rb_right); } else { - /* same source network -> be counted! */ - nf_conncount_lookup(net, &rbconn->list, tuple, zone, - &addit); + int ret;
- if (!addit) + if (!tuple) { + nf_conncount_gc_list(net, &rbconn->list); return rbconn->list.count; + }
- ret = nf_conncount_add(&rbconn->list, tuple, zone); - if (ret == NF_CONNCOUNT_ERR) { - return 0; /* hotdrop */ - } else if (ret == NF_CONNCOUNT_ADDED) { - return rbconn->list.count; - } else { - /* NF_CONNCOUNT_SKIP, rbconn is already - * reclaimed by gc, insert a new tree node - */ + spin_lock_bh(&rbconn->list.list_lock); + /* Node might be about to be free'd. + * We need to defer to insert_tree() in this case. + */ + if (rbconn->list.count == 0) { + spin_unlock_bh(&rbconn->list.list_lock); break; } + + /* same source network -> be counted! */ + ret = __nf_conncount_add(net, &rbconn->list, tuple, zone); + spin_unlock_bh(&rbconn->list.list_lock); + if (ret) + return 0; /* hotdrop */ + else + return rbconn->list.count; } }
--- a/net/netfilter/nft_connlimit.c +++ b/net/netfilter/nft_connlimit.c @@ -30,7 +30,6 @@ static inline void nft_connlimit_do_eval enum ip_conntrack_info ctinfo; const struct nf_conn *ct; unsigned int count; - bool addit;
tuple_ptr = &tuple;
@@ -44,19 +43,12 @@ static inline void nft_connlimit_do_eval return; }
- nf_conncount_lookup(nft_net(pkt), &priv->list, tuple_ptr, zone, - &addit); - count = priv->list.count; - - if (!addit) - goto out; - - if (nf_conncount_add(&priv->list, tuple_ptr, zone) == NF_CONNCOUNT_ERR) { + if (nf_conncount_add(nft_net(pkt), &priv->list, tuple_ptr, zone)) { regs->verdict.code = NF_DROP; return; } - count++; -out: + + count = priv->list.count;
if ((count > priv->limit) ^ priv->invert) { regs->verdict.code = NFT_BREAK;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit 2f971a8f425545da52ca0e6bee81f5b1ea0ccc5f upstream.
Two CPUs may race to remove a connection from the list; the existing conn->dead flag does not prevent a use-after-free. Use the per-list spinlock to protect list iterations.
As all accesses to the list now happen while holding the per-list lock, we no longer need to delay free operations with rcu.
Joint work with Florian.
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search") Reviewed-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conncount.c | 46 ++++++++++++++++++------------------------- 1 file changed, 20 insertions(+), 26 deletions(-)
--- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -43,8 +43,6 @@ struct nf_conncount_tuple { struct nf_conntrack_zone zone; int cpu; u32 jiffies32; - bool dead; - struct rcu_head rcu_head; };
struct nf_conncount_rb { @@ -83,36 +81,21 @@ static int key_diff(const u32 *a, const return memcmp(a, b, klen * sizeof(u32)); }
-static void __conn_free(struct rcu_head *h) -{ - struct nf_conncount_tuple *conn; - - conn = container_of(h, struct nf_conncount_tuple, rcu_head); - kmem_cache_free(conncount_conn_cachep, conn); -} - static bool conn_free(struct nf_conncount_list *list, struct nf_conncount_tuple *conn) { bool free_entry = false;
- spin_lock_bh(&list->list_lock); - - if (conn->dead) { - spin_unlock_bh(&list->list_lock); - return free_entry; - } + lockdep_assert_held(&list->list_lock);
list->count--; - conn->dead = true; - list_del_rcu(&conn->node); + list_del(&conn->node); if (list->count == 0) { list->dead = true; free_entry = true; }
- spin_unlock_bh(&list->list_lock); - call_rcu(&conn->rcu_head, __conn_free); + kmem_cache_free(conncount_conn_cachep, conn); return free_entry; }
@@ -242,7 +225,7 @@ void nf_conncount_list_init(struct nf_co } EXPORT_SYMBOL_GPL(nf_conncount_list_init);
-/* Return true if the list is empty */ +/* Return true if the list is empty. Must be called with BH disabled. */ bool nf_conncount_gc_list(struct net *net, struct nf_conncount_list *list) { @@ -253,12 +236,18 @@ bool nf_conncount_gc_list(struct net *ne bool free_entry = false; bool ret = false;
+ /* don't bother if other cpu is already doing GC */ + if (!spin_trylock(&list->list_lock)) + return false; + list_for_each_entry_safe(conn, conn_n, &list->head, node) { found = find_or_evict(net, list, conn, &free_entry); if (IS_ERR(found)) { if (PTR_ERR(found) == -ENOENT) { - if (free_entry) + if (free_entry) { + spin_unlock(&list->list_lock); return true; + } collected++; } continue; @@ -271,23 +260,24 @@ bool nf_conncount_gc_list(struct net *ne * closed already -> ditch it */ nf_ct_put(found_ct); - if (conn_free(list, conn)) + if (conn_free(list, conn)) { + spin_unlock(&list->list_lock); return true; + } collected++; continue; }
nf_ct_put(found_ct); if (collected > CONNCOUNT_GC_MAX_NODES) - return false; + break; }
- spin_lock_bh(&list->list_lock); if (!list->count) { list->dead = true; ret = true; } - spin_unlock_bh(&list->list_lock); + spin_unlock(&list->list_lock);
return ret; } @@ -478,6 +468,7 @@ static void tree_gc_worker(struct work_s tree = data->gc_tree % CONNCOUNT_SLOTS; root = &data->root[tree];
+ local_bh_disable(); rcu_read_lock(); for (node = rb_first(root); node != NULL; node = rb_next(node)) { rbconn = rb_entry(node, struct nf_conncount_rb, node); @@ -485,6 +476,9 @@ static void tree_gc_worker(struct work_s gc_count++; } rcu_read_unlock(); + local_bh_enable(); + + cond_resched();
spin_lock_bh(&nf_conncount_locks[tree]); if (gc_count < ARRAY_SIZE(gc_nodes))
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
commit c80f10bc973af2ace6b1414724eeff61eaa71837 upstream.
Instead of removing an empty list node that might be reintroduced soon thereafter, tentatively place the empty list node on the list passed to tree_nodes_free(), then re-check whether the list is still empty before erasing it from the tree.
[ Florian: rebase on top of pending nf_conncount fixes ]
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search") Reviewed-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/net/netfilter/nf_conntrack_count.h | 1 net/netfilter/nf_conncount.c | 47 +++++++++-------------------- 2 files changed, 15 insertions(+), 33 deletions(-)
--- a/include/net/netfilter/nf_conntrack_count.h +++ b/include/net/netfilter/nf_conntrack_count.h @@ -9,7 +9,6 @@ struct nf_conncount_list { spinlock_t list_lock; struct list_head head; /* connections with the same filtering key */ unsigned int count; /* length of list */ - bool dead; };
struct nf_conncount_data *nf_conncount_init(struct net *net, unsigned int family, --- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -81,27 +81,20 @@ static int key_diff(const u32 *a, const return memcmp(a, b, klen * sizeof(u32)); }
-static bool conn_free(struct nf_conncount_list *list, +static void conn_free(struct nf_conncount_list *list, struct nf_conncount_tuple *conn) { - bool free_entry = false; - lockdep_assert_held(&list->list_lock);
list->count--; list_del(&conn->node); - if (list->count == 0) { - list->dead = true; - free_entry = true; - }
kmem_cache_free(conncount_conn_cachep, conn); - return free_entry; }
static const struct nf_conntrack_tuple_hash * find_or_evict(struct net *net, struct nf_conncount_list *list, - struct nf_conncount_tuple *conn, bool *free_entry) + struct nf_conncount_tuple *conn) { const struct nf_conntrack_tuple_hash *found; unsigned long a, b; @@ -121,7 +114,7 @@ find_or_evict(struct net *net, struct nf */ age = a - b; if (conn->cpu == cpu || age >= 2) { - *free_entry = conn_free(list, conn); + conn_free(list, conn); return ERR_PTR(-ENOENT); }
@@ -137,14 +130,13 @@ static int __nf_conncount_add(struct net struct nf_conncount_tuple *conn, *conn_n; struct nf_conn *found_ct; unsigned int collect = 0; - bool free_entry = false;
/* check the saved connections */ list_for_each_entry_safe(conn, conn_n, &list->head, node) { if (collect > CONNCOUNT_GC_MAX_NODES) break;
- found = find_or_evict(net, list, conn, &free_entry); + found = find_or_evict(net, list, conn); if (IS_ERR(found)) { /* Not found, but might be about to be confirmed */ if (PTR_ERR(found) == -EAGAIN) { @@ -221,7 +213,6 @@ void nf_conncount_list_init(struct nf_co spin_lock_init(&list->list_lock); INIT_LIST_HEAD(&list->head); list->count = 0; - list->dead = false; } EXPORT_SYMBOL_GPL(nf_conncount_list_init);
@@ -233,7 +224,6 @@ bool nf_conncount_gc_list(struct net *ne struct nf_conncount_tuple *conn, *conn_n; struct nf_conn *found_ct; unsigned int collected = 0; - bool free_entry = false; bool ret = false;
/* don't bother if other cpu is already doing GC */ @@ -241,15 +231,10 @@ bool nf_conncount_gc_list(struct net *ne return false;
list_for_each_entry_safe(conn, conn_n, &list->head, node) { - found = find_or_evict(net, list, conn, &free_entry); + found = find_or_evict(net, list, conn); if (IS_ERR(found)) { - if (PTR_ERR(found) == -ENOENT) { - if (free_entry) { - spin_unlock(&list->list_lock); - return true; - } + if (PTR_ERR(found) == -ENOENT) collected++; - } continue; }
@@ -260,10 +245,7 @@ bool nf_conncount_gc_list(struct net *ne * closed already -> ditch it */ nf_ct_put(found_ct); - if (conn_free(list, conn)) { - spin_unlock(&list->list_lock); - return true; - } + conn_free(list, conn); collected++; continue; } @@ -273,10 +255,8 @@ bool nf_conncount_gc_list(struct net *ne break; }
- if (!list->count) { - list->dead = true; + if (!list->count) ret = true; - } spin_unlock(&list->list_lock);
return ret; @@ -291,6 +271,7 @@ static void __tree_nodes_free(struct rcu kmem_cache_free(conncount_rb_cachep, rbconn); }
+/* caller must hold tree nf_conncount_locks[] lock */ static void tree_nodes_free(struct rb_root *root, struct nf_conncount_rb *gc_nodes[], unsigned int gc_count) @@ -300,8 +281,10 @@ static void tree_nodes_free(struct rb_ro while (gc_count) { rbconn = gc_nodes[--gc_count]; spin_lock(&rbconn->list.list_lock); - rb_erase(&rbconn->node, root); - call_rcu(&rbconn->rcu_head, __tree_nodes_free); + if (!rbconn->list.count) { + rb_erase(&rbconn->node, root); + call_rcu(&rbconn->rcu_head, __tree_nodes_free); + } spin_unlock(&rbconn->list.list_lock); } } @@ -318,7 +301,6 @@ insert_tree(struct net *net, struct rb_root *root, unsigned int hash, const u32 *key, - u8 keylen, const struct nf_conntrack_tuple *tuple, const struct nf_conntrack_zone *zone) { @@ -327,6 +309,7 @@ insert_tree(struct net *net, struct nf_conncount_rb *rbconn; struct nf_conncount_tuple *conn; unsigned int count = 0, gc_count = 0; + u8 keylen = data->keylen; bool do_gc = true;
spin_lock_bh(&nf_conncount_locks[hash]); @@ -454,7 +437,7 @@ count_tree(struct net *net, if (!tuple) return 0;
- return insert_tree(net, data, root, hash, key, keylen, tuple, zone); + return insert_tree(net, data, root, hash, key, tuple, zone); }
static void tree_gc_worker(struct work_struct *work)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
commit a007232066f6839d6f256bab21e825d968f1a163 upstream.
Size and 'next bit' were swapped; this bug could cause the worker to reschedule itself even if the system was idle.
Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search") Reviewed-by: Shawn Bohrer sbohrer@cloudflare.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conncount.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -488,7 +488,7 @@ next: clear_bit(tree, data->pending_trees);
next_tree = (tree + 1) % CONNCOUNT_SLOTS; - next_tree = find_next_bit(data->pending_trees, next_tree, CONNCOUNT_SLOTS); + next_tree = find_next_bit(data->pending_trees, CONNCOUNT_SLOTS, next_tree);
if (next_tree < CONNCOUNT_SLOTS) { data->gc_tree = next_tree;
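The kernel helper is find_next_bit(addr, size, offset). A small userspace model with a simplified stand-in (not the kernel implementation) shows how the swapped arguments made the worker believe there was always a pending tree:

	#include <stdio.h>

	#define CONNCOUNT_SLOTS 256U

	/* simplified stand-in for the kernel helper: returns the index of the
	 * first set bit at or above 'offset', or 'size' if there is none */
	static unsigned int my_find_next_bit(const unsigned char *bits,
					     unsigned int size, unsigned int offset)
	{
		unsigned int i;

		for (i = offset; i < size; i++)
			if (bits[i])
				return i;
		return size;
	}

	int main(void)
	{
		unsigned char pending[CONNCOUNT_SLOTS] = { 0 };	/* nothing pending */
		unsigned int tree = 5, next_tree = (tree + 1) % CONNCOUNT_SLOTS;

		/* buggy order: size and offset swapped -> returns 'size' (6),
		 * which is < CONNCOUNT_SLOTS, so the worker reschedules itself */
		unsigned int buggy = my_find_next_bit(pending, next_tree, CONNCOUNT_SLOTS);
		/* fixed order: returns CONNCOUNT_SLOTS (256) -> nothing to do */
		unsigned int fixed = my_find_next_bit(pending, CONNCOUNT_SLOTS, next_tree);

		printf("buggy=%u fixed=%u (CONNCOUNT_SLOTS=%u)\n",
		       buggy, fixed, CONNCOUNT_SLOTS);
		return 0;
	}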
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Loic Poulain loic.poulain@linaro.org
commit a89e7bcb18081c611eb6cf50edd440fa4983a71a upstream.
The Clock Data Recovery (CDR) circuit automatically adjusts the RX sampling point/phase for high frequency cards (SDR104, HS200...). CDR is automatically enabled during DLL configuration. However, according to the APQ8016 reference manual, this function must be disabled during TX and the tuning phase in order to prevent any interference during tuning challenges and unexpected phase alteration during TX transfers.
This patch enables/disables CDR according to the current transfer mode.
This fixes sporadic write transfer issues observed with some SDR104 and HS200 cards.
Inspired by sdhci-msm downstream patch: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4...
Reported-by: Leonid Segal leonid.s@variscite.com Reported-by: Manabu Igusa migusa@arrowjapan.com Signed-off-by: Loic Poulain loic.poulain@linaro.org Acked-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Georgi Djakov georgi.djakov@linaro.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org [georgi: backport to v4.19+] Signed-off-by: Georgi Djakov georgi.djakov@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-msm.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-)
--- a/drivers/mmc/host/sdhci-msm.c +++ b/drivers/mmc/host/sdhci-msm.c @@ -258,6 +258,8 @@ struct sdhci_msm_host { bool mci_removed; const struct sdhci_msm_variant_ops *var_ops; const struct sdhci_msm_offset *offset; + bool use_cdr; + u32 transfer_mode; };
static const struct sdhci_msm_offset *sdhci_priv_msm_offset(struct sdhci_host *host) @@ -1025,6 +1027,26 @@ out: return ret; }
+static void sdhci_msm_set_cdr(struct sdhci_host *host, bool enable) +{ + const struct sdhci_msm_offset *msm_offset = sdhci_priv_msm_offset(host); + u32 config, oldconfig = readl_relaxed(host->ioaddr + + msm_offset->core_dll_config); + + config = oldconfig; + if (enable) { + config |= CORE_CDR_EN; + config &= ~CORE_CDR_EXT_EN; + } else { + config &= ~CORE_CDR_EN; + config |= CORE_CDR_EXT_EN; + } + + if (config != oldconfig) + writel_relaxed(config, host->ioaddr + + msm_offset->core_dll_config); +} + static int sdhci_msm_execute_tuning(struct mmc_host *mmc, u32 opcode) { struct sdhci_host *host = mmc_priv(mmc); @@ -1042,8 +1064,14 @@ static int sdhci_msm_execute_tuning(stru if (host->clock <= CORE_FREQ_100MHZ || !(ios.timing == MMC_TIMING_MMC_HS400 || ios.timing == MMC_TIMING_MMC_HS200 || - ios.timing == MMC_TIMING_UHS_SDR104)) + ios.timing == MMC_TIMING_UHS_SDR104)) { + msm_host->use_cdr = false; + sdhci_msm_set_cdr(host, false); return 0; + } + + /* Clock-Data-Recovery used to dynamically adjust RX sampling point */ + msm_host->use_cdr = true;
/* * For HS400 tuning in HS200 timing requires: @@ -1525,6 +1553,19 @@ static int __sdhci_msm_check_write(struc case SDHCI_POWER_CONTROL: req_type = !val ? REQ_BUS_OFF : REQ_BUS_ON; break; + case SDHCI_TRANSFER_MODE: + msm_host->transfer_mode = val; + break; + case SDHCI_COMMAND: + if (!msm_host->use_cdr) + break; + if ((msm_host->transfer_mode & SDHCI_TRNS_READ) && + SDHCI_GET_CMD(val) != MMC_SEND_TUNING_BLOCK_HS200 && + SDHCI_GET_CMD(val) != MMC_SEND_TUNING_BLOCK) + sdhci_msm_set_cdr(host, true); + else + sdhci_msm_set_cdr(host, false); + break; }
if (req_type) {
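Reflowed from the hunk above for readability, the new sdhci_msm_set_cdr() helper simply flips CORE_CDR_EN/CORE_CDR_EXT_EN and only writes the register when the value actually changes (the comments below are added here and are not part of the patch):

	static void sdhci_msm_set_cdr(struct sdhci_host *host, bool enable)
	{
		const struct sdhci_msm_offset *msm_offset = sdhci_priv_msm_offset(host);
		u32 config, oldconfig = readl_relaxed(host->ioaddr +
						      msm_offset->core_dll_config);

		config = oldconfig;
		if (enable) {
			/* RX data transfer: let CDR track the sampling point */
			config |= CORE_CDR_EN;
			config &= ~CORE_CDR_EXT_EN;
		} else {
			/* TX transfer or tuning command: keep CDR out of the way */
			config &= ~CORE_CDR_EN;
			config |= CORE_CDR_EXT_EN;
		}

		/* avoid a register write when nothing changes */
		if (config != oldconfig)
			writel_relaxed(config, host->ioaddr +
				       msm_offset->core_dll_config);
	}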
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
This reverts commit c9cef2c71a89a2c926dae8151f9497e72f889315.
A wrong commit message was used for the stable commit because of a human error (and duplicate commit subject lines).
This patch reverts this error, and the following patches add the two upstream commits.
Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c index b289b90ae6dc..8de16016b6de 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c @@ -631,11 +631,8 @@ static void cxgbit_send_halfclose(struct cxgbit_sock *csk)
static void cxgbit_arp_failure_discard(void *handle, struct sk_buff *skb) { - struct cxgbit_sock *csk = handle; - pr_debug("%s cxgbit_device %p\n", __func__, handle); kfree_skb(skb); - cxgbit_put_csk(csk); }
static void cxgbit_abort_arp_failure(void *handle, struct sk_buff *skb) @@ -1193,7 +1190,7 @@ cxgbit_pass_accept_rpl(struct cxgbit_sock *csk, struct cpl_pass_accept_req *req) rpl5->opt0 = cpu_to_be64(opt0); rpl5->opt2 = cpu_to_be32(opt2); set_wr_txq(skb, CPL_PRIORITY_SETUP, csk->ctrlq_idx); - t4_set_arp_err_handler(skb, csk, cxgbit_arp_failure_discard); + t4_set_arp_err_handler(skb, NULL, cxgbit_arp_failure_discard); cxgbit_l2t_send(csk->com.cdev, skb, csk->l2t); }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 801df68d617e3cb831f531c99fa6003620e6b343 ]
A csk leak can happen if a new TCP connection gets established after cxgbit_accept_np() returns; to fix this leak, free any remaining csk in cxgbit_free_np().
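For background, the fix below drains whatever connections are still queued on cnp->np_accept_list under its lock and releases each of them. A rough, generic sketch of that drain pattern, with made-up names rather than the driver's own (struct conn, free_conn and drain_pending are all invented here):

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct conn {
        struct list_head node;
        /* ... per-connection state ... */
};

static void free_conn(struct conn *c)
{
        kfree(c);               /* stand-in for the real teardown */
}

static void drain_pending(struct list_head *head, spinlock_t *lock)
{
        struct conn *c, *tmp;

        spin_lock_bh(lock);
        /* _safe variant because each entry is unlinked while iterating */
        list_for_each_entry_safe(c, tmp, head, node) {
                list_del_init(&c->node);
                free_conn(c);
        }
        spin_unlock_bh(lock);
}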
Signed-off-by: Varun Prakash varun@chelsio.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c index 8de16016b6de..71888b979ab5 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c @@ -598,9 +598,12 @@ static void cxgbit_free_cdev_np(struct cxgbit_np *cnp) mutex_unlock(&cdev_list_lock); }
+static void __cxgbit_free_conn(struct cxgbit_sock *csk); + void cxgbit_free_np(struct iscsi_np *np) { struct cxgbit_np *cnp = np->np_context; + struct cxgbit_sock *csk, *tmp;
cnp->com.state = CSK_STATE_DEAD; if (cnp->com.cdev) @@ -608,6 +611,13 @@ void cxgbit_free_np(struct iscsi_np *np) else cxgbit_free_all_np(cnp);
+ spin_lock_bh(&cnp->np_accept_lock); + list_for_each_entry_safe(csk, tmp, &cnp->np_accept_list, accept_node) { + list_del_init(&csk->accept_node); + __cxgbit_free_conn(csk); + } + spin_unlock_bh(&cnp->np_accept_lock); + np->np_context = NULL; cxgbit_put_cnp(cnp); } @@ -705,9 +715,9 @@ void cxgbit_abort_conn(struct cxgbit_sock *csk) csk->tid, 600, __func__); }
-void cxgbit_free_conn(struct iscsi_conn *conn) +static void __cxgbit_free_conn(struct cxgbit_sock *csk) { - struct cxgbit_sock *csk = conn->context; + struct iscsi_conn *conn = csk->conn; bool release = false;
pr_debug("%s: state %d\n", @@ -716,7 +726,7 @@ void cxgbit_free_conn(struct iscsi_conn *conn) spin_lock_bh(&csk->lock); switch (csk->com.state) { case CSK_STATE_ESTABLISHED: - if (conn->conn_state == TARG_CONN_STATE_IN_LOGOUT) { + if (conn && (conn->conn_state == TARG_CONN_STATE_IN_LOGOUT)) { csk->com.state = CSK_STATE_CLOSING; cxgbit_send_halfclose(csk); } else { @@ -741,6 +751,11 @@ void cxgbit_free_conn(struct iscsi_conn *conn) cxgbit_put_csk(csk); }
+void cxgbit_free_conn(struct iscsi_conn *conn) +{ + __cxgbit_free_conn(conn->context); +} + static void cxgbit_set_emss(struct cxgbit_sock *csk, u16 opt) { csk->emss = csk->com.cdev->lldi.mtus[TCPOPT_MSS_G(opt)] - @@ -803,6 +818,7 @@ void _cxgbit_free_csk(struct kref *kref) spin_unlock_bh(&cdev->cskq.lock);
cxgbit_free_skb(csk); + cxgbit_put_cnp(csk->cnp); cxgbit_put_cdev(cdev);
kfree(csk); @@ -1351,6 +1367,7 @@ cxgbit_pass_accept_req(struct cxgbit_device *cdev, struct sk_buff *skb) goto rel_skb; }
+ cxgbit_get_cnp(cnp); cxgbit_get_cdev(cdev);
spin_lock(&cdev->cskq.lock);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ed076c55b359cc9982ca8b065bcc01675f7365f6 ]
In case of ARP failure, call cxgbit_put_csk() to free the csk.
Signed-off-by: Varun Prakash varun@chelsio.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c index 71888b979ab5..b19c960d5490 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c @@ -641,8 +641,11 @@ static void cxgbit_send_halfclose(struct cxgbit_sock *csk)
static void cxgbit_arp_failure_discard(void *handle, struct sk_buff *skb) { + struct cxgbit_sock *csk = handle; + pr_debug("%s cxgbit_device %p\n", __func__, handle); kfree_skb(skb); + cxgbit_put_csk(csk); }
static void cxgbit_abort_arp_failure(void *handle, struct sk_buff *skb) @@ -1206,7 +1209,7 @@ cxgbit_pass_accept_rpl(struct cxgbit_sock *csk, struct cpl_pass_accept_req *req) rpl5->opt0 = cpu_to_be64(opt0); rpl5->opt2 = cpu_to_be32(opt2); set_wr_txq(skb, CPL_PRIORITY_SETUP, csk->ctrlq_idx); - t4_set_arp_err_handler(skb, NULL, cxgbit_arp_failure_discard); + t4_set_arp_err_handler(skb, csk, cxgbit_arp_failure_discard); cxgbit_l2t_send(csk->com.cdev, skb, csk->l2t); }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 4eaed6aa2c628101246bcabc91b203bfac1193f8 ]
In KVM we define the configuration of HCR_EL2 for a VHE host in HCR_HOST_VHE_FLAGS, but we don't have a similar definition for the non-VHE host flags, and open-code HCR_RW. Further, in head.S we open-code the flags for VHE and non-VHE configurations.
In future, we're going to want to configure more flags for the host, so let's add an HCR_HOST_NVHE_FLAGS definition, and consistently use both HCR_HOST_VHE_FLAGS and HCR_HOST_NVHE_FLAGS in the KVM code and head.S.
We now use mov_q to generate the HCR_EL2 value, as we do when configuring other registers in head.S.
Reviewed-by: Marc Zyngier marc.zyngier@arm.com Reviewed-by: Richard Henderson richard.henderson@linaro.org Signed-off-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Kristina Martsenko kristina.martsenko@arm.com Reviewed-by: Christoffer Dall christoffer.dall@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier marc.zyngier@arm.com Cc: Will Deacon will.deacon@arm.com Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/kernel/head.S | 5 ++--- arch/arm64/kvm/hyp/switch.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 95e3fa7ded8b..2297284e33f2 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -87,6 +87,7 @@ HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \ HCR_FMO | HCR_IMO) #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) +#define HCR_HOST_NVHE_FLAGS (HCR_RW) #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
/* TCR_EL2 Registers bits */ diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index b0853069702f..651a06b1980f 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -494,10 +494,9 @@ ENTRY(el2_setup) #endif
/* Hyp configuration. */ - mov x0, #HCR_RW // 64-bit EL1 + mov_q x0, HCR_HOST_NVHE_FLAGS cbz x2, set_hcr - orr x0, x0, #HCR_TGE // Enable Host Extensions - orr x0, x0, #HCR_E2H + mov_q x0, HCR_HOST_VHE_FLAGS set_hcr: msr hcr_el2, x0 isb diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index ca46153d7915..a1c32c1f2267 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -157,7 +157,7 @@ static void __hyp_text __deactivate_traps_nvhe(void) mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
write_sysreg(mdcr_el2, mdcr_el2); - write_sysreg(HCR_RW, hcr_el2); + write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2); write_sysreg(CPTR_EL2_DEFAULT, cptr_el2); }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit b3669b1e1c09890d61109a1a8ece2c5b66804714 ]
To allow EL0 (and/or EL1) to use pointer authentication functionality, we must ensure that pointer authentication instructions and accesses to pointer authentication keys are not trapped to EL2.
This patch ensures that HCR_EL2 is configured appropriately when the kernel is booted at EL2. For non-VHE kernels we set HCR_EL2.{API,APK}, ensuring that EL1 can access keys and permitting EL0 use of the instructions. For VHE kernels, host EL0 (TGE && E2H) is unaffected by these settings, and it doesn't matter how we configure HCR_EL2.{API,APK}, so we don't bother setting them.
This does not enable support for KVM guests, since KVM manages HCR_EL2 itself when running VMs.
Reviewed-by: Richard Henderson richard.henderson@linaro.org Signed-off-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Kristina Martsenko kristina.martsenko@arm.com Acked-by: Christoffer Dall christoffer.dall@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier marc.zyngier@arm.com Cc: Will Deacon will.deacon@arm.com Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/kvm_arm.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 2297284e33f2..8b284cbf8162 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -24,6 +24,8 @@
/* Hyp Configuration Register (HCR) bits */ #define HCR_FWB (UL(1) << 46) +#define HCR_API (UL(1) << 41) +#define HCR_APK (UL(1) << 40) #define HCR_TEA (UL(1) << 37) #define HCR_TERR (UL(1) << 36) #define HCR_TLOR (UL(1) << 35) @@ -87,7 +89,7 @@ HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \ HCR_FMO | HCR_IMO) #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) -#define HCR_HOST_NVHE_FLAGS (HCR_RW) +#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK) #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
/* TCR_EL2 Registers bits */
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 7d033c9f6a7fd3821af75620a0257db87c2b552a ]
This patch makes sure the flow label in the IPv6 header forged in ipv6_local_error() is initialized.
BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 CPU: 1 PID: 24675 Comm: syz-executor1 Not tainted 4.20.0-rc7+ #4 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 kmsan_internal_check_memory+0x455/0xb00 mm/kmsan/kmsan.c:675 kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601 _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 copy_to_user include/linux/uaccess.h:177 [inline] move_addr_to_user+0x2e9/0x4f0 net/socket.c:227 ___sys_recvmsg+0x5d7/0x1140 net/socket.c:2284 __sys_recvmsg net/socket.c:2327 [inline] __do_sys_recvmsg net/socket.c:2337 [inline] __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x457ec9 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f8750c06c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002f RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9 RDX: 0000000000002000 RSI: 0000000020000400 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8750c076d4 R13: 00000000004c4a60 R14: 00000000004d8140 R15: 00000000ffffffff
Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_save_stack mm/kmsan/kmsan.c:219 [inline] kmsan_internal_chain_origin+0x134/0x230 mm/kmsan/kmsan.c:439 __msan_chain_origin+0x70/0xe0 mm/kmsan/kmsan_instr.c:200 ipv6_recv_error+0x1e3f/0x1eb0 net/ipv6/datagram.c:475 udpv6_recvmsg+0x398/0x2ab0 net/ipv6/udp.c:335 inet_recvmsg+0x4fb/0x600 net/ipv4/af_inet.c:830 sock_recvmsg_nosec net/socket.c:794 [inline] sock_recvmsg+0x1d1/0x230 net/socket.c:801 ___sys_recvmsg+0x4d5/0x1140 net/socket.c:2278 __sys_recvmsg net/socket.c:2327 [inline] __do_sys_recvmsg net/socket.c:2337 [inline] __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2759 [inline] __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383 __kmalloc_reserve net/core/skbuff.c:137 [inline] __alloc_skb+0x309/0xa20 net/core/skbuff.c:205 alloc_skb include/linux/skbuff.h:998 [inline] ipv6_local_error+0x1a7/0x9e0 net/ipv6/datagram.c:334 __ip6_append_data+0x129f/0x4fd0 net/ipv6/ip6_output.c:1311 ip6_make_skb+0x6cc/0xcf0 net/ipv6/ip6_output.c:1775 udpv6_sendmsg+0x3f8e/0x45d0 net/ipv6/udp.c:1384 inet_sendmsg+0x54a/0x720 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] __sys_sendto+0x8c4/0xac0 net/socket.c:1788 __do_sys_sendto net/socket.c:1800 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:1796 __x64_sys_sendto+0x6e/0x90 net/socket.c:1796 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
Bytes 4-7 of 28 are uninitialized Memory access of size 28 starts at ffff8881937bfce0 Data copied to user address 0000000020000000
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/datagram.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -341,6 +341,7 @@ void ipv6_local_error(struct sock *sk, i skb_reset_network_header(skb); iph = ipv6_hdr(skb); iph->daddr = fl6->daddr; + ip6_flow_hdr(iph, 0, 0);
serr = SKB_EXT_ERR(skb); serr->ee.ee_errno = err;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: JianJhen Chen kchen@synology.com
[ Upstream commit 4c84edc11b76590859b1e45dd676074c59602dc4 ]
When handling DNAT'ed packets on a bridge device, the neighbour cache entry from lookup was used without checking its state. This means that a cache entry in the NUD_STALE state will be used directly instead of entering the NUD_DELAY state to confirm the reachability of the neighbour.
This problem becomes worse after commit 2724680bceee ("neigh: Keep neighbour cache entries if number of them is small enough."), since all neighbour cache entries in the NUD_STALE state will be kept in the neighbour table as long as the number of cache entries does not exceed the value specified in gc_thresh1.
This commit validates the state of a neighbour cache entry before using the entry.
Signed-off-by: JianJhen Chen kchen@synology.com Reviewed-by: JinLin Chen jlchen@synology.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/br_netfilter_hooks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -278,7 +278,7 @@ int br_nf_pre_routing_finish_bridge(stru struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb); int ret;
- if (neigh->hh.hh_len) { + if ((neigh->nud_state & NUD_CONNECTED) && neigh->hh.hh_len) { neigh_hh_bridge(&neigh->hh, skb); skb->dev = nf_bridge->physindev; ret = br_handle_frame_finish(net, sk, skb);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Gunthorpe jgg@mellanox.com
[ Upstream commit d972f3dce8d161e2142da0ab1ef25df00e2f21a9 ]
'dev' is non-NULL when the addr_len check triggers, so it must goto a label that does the dev_put(); otherwise dev will have a leaked refcount.
This bug causes the ib_ipoib module to become unloadable when using systemd-network, as it triggers this check on InfiniBand links.
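As a rough illustration of the error-path shape (this is not the real tpacket_snd()/packet_snd(); example_send and its parameters are made up), once dev_get_by_index() has taken a reference, every later failure has to funnel through a label that drops it:

#include <linux/errno.h>
#include <linux/netdevice.h>

static int example_send(struct net *net, int ifindex, unsigned int halen)
{
        struct net_device *dev;
        int err;

        dev = dev_get_by_index(net, ifindex);   /* takes a reference */
        if (!dev)
                return -ENXIO;

        if (halen < dev->addr_len) {
                err = -EINVAL;
                goto out_put;   /* must drop the reference just taken */
        }

        /* ... build and transmit the packet ... */
        err = 0;

out_put:
        dev_put(dev);
        return err;
}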
Fixes: 99137b7888f4 ("packet: validate address length") Reported-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/packet/af_packet.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2628,7 +2628,7 @@ static int tpacket_snd(struct packet_soc addr = saddr->sll_halen ? saddr->sll_addr : NULL; dev = dev_get_by_index(sock_net(&po->sk), saddr->sll_ifindex); if (addr && dev && saddr->sll_halen < dev->addr_len) - goto out; + goto out_put; }
err = -ENXIO; @@ -2828,7 +2828,7 @@ static int packet_snd(struct socket *soc addr = saddr->sll_halen ? saddr->sll_addr : NULL; dev = dev_get_by_index(sock_net(sk), saddr->sll_ifindex); if (addr && dev && saddr->sll_halen < dev->addr_len) - goto out; + goto out_unlock; }
err = -ENXIO;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuchung Cheng ycheng@google.com
[ Upstream commit c5715b8fabfca0ef85903f8bad2189940ed41cc8 ]
Previously, upon SYN timeouts the sender recomputes the txhash to try a different path. However, this does not apply on the initial timeout of SYN-data (active Fast Open). Therefore an active IPv6 Fast Open connection may incur a one-second RTO penalty to take on a new path after the second SYN retransmission uses a new flow label.
This patch removes this undesirable behavior, so Fast Open changes the flow label just like regular connections do. This also helps avoid falsely disabling Fast Open on the sender, which is triggered after two consecutive SYN timeouts on Fast Open.
Signed-off-by: Yuchung Cheng ycheng@google.com Reviewed-by: Neal Cardwell ncardwell@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_timer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -224,7 +224,7 @@ static int tcp_write_timeout(struct sock if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) { if (icsk->icsk_retransmits) { dst_negative_advice(sk); - } else if (!tp->syn_data && !tp->syn_fastopen) { + } else { sk_rethink_txhash(sk); } retry_until = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanislav Fomichev sdf@google.com
[ Upstream commit 0b7959b6257322f7693b08a459c505d4938646f2 ]
BUG: unable to handle kernel NULL pointer dereference at 00000000000000d1
Call Trace:
 ? napi_gro_frags+0xa7/0x2c0
 tun_get_user+0xb50/0xf20
 tun_chr_write_iter+0x53/0x70
 new_sync_write+0xff/0x160
 vfs_write+0x191/0x1e0
 __x64_sys_write+0x5e/0xd0
 do_syscall_64+0x47/0xf0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
I think there is a subtle race between sending a packet via tap and attaching it:
CPU0:                                    CPU1:
tun_chr_ioctl(TUNSETIFF)
  tun_set_iff
    tun_attach
      rcu_assign_pointer(tfile->tun, tun);
                                         tun_fops->write_iter()
                                           tun_chr_write_iter
                                             tun_napi_alloc_frags
                                               napi_get_frags
                                                 napi->skb = napi_alloc_skb
      tun_napi_init
        netif_napi_add
          napi->skb = NULL
                                         napi->skb is NULL here
                                         napi_gro_frags
                                           napi_frags_skb
                                             skb = napi->skb
                                             skb_reset_mac_header(skb)
                                             panic()
Move rcu_assign_pointer(tfile->tun) and rcu_assign_pointer(tun->tfiles) to be the last thing we do in tun_attach(); this should guarantee that when we call tun_get() we always get an initialized object.
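The underlying rule is the usual RCU publish pattern: finish every store to the object, then make the pointer visible. A minimal, generic sketch, with invented names rather than tun's (struct foo, global_foo, publish_foo):

#include <linux/errno.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct foo {
        int a;
        int b;
};

static struct foo __rcu *global_foo;

static int publish_foo(void)
{
        struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

        if (!f)
                return -ENOMEM;

        /* Finish all initialisation first ... */
        f->a = 1;
        f->b = 2;

        /* ... then publish. rcu_assign_pointer() orders the stores above
         * before the pointer becomes visible, so readers that do
         * rcu_dereference(global_foo) under rcu_read_lock() never see a
         * half-initialized object.
         */
        rcu_assign_pointer(global_foo, f);
        return 0;
}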
v2 changes: * remove extra napi_mutex locks/unlocks for napi operations
Reported-by: syzbot syzkaller@googlegroups.com Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Stanislav Fomichev sdf@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/tun.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -859,10 +859,6 @@ static int tun_attach(struct tun_struct err = 0; }
- rcu_assign_pointer(tfile->tun, tun); - rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile); - tun->numqueues++; - if (tfile->detached) { tun_enable_queue(tfile); } else { @@ -876,6 +872,13 @@ static int tun_attach(struct tun_struct * refcnt. */
+ /* Publish tfile->tun and tun->tfiles only after we've fully + * initialized tfile; otherwise we risk using half-initialized + * object. + */ + rcu_assign_pointer(tfile->tun, tun); + rcu_assign_pointer(tun->tfiles[tun->numqueues], tfile); + tun->numqueues++; out: return err; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan Whitehead Bryan.Whitehead@microchip.com
[ Upstream commit a0071840d2040ea1b27e5a008182b09b88defc15 ]
It has been noticed that some PHYs do not have the registers required by the previous implementation.
To fix this, instead of using phy_read, the required information is extracted from the phy_device structure.
fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead Bryan.Whitehead@microchip.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/microchip/lan743x_main.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -962,13 +962,10 @@ static void lan743x_phy_link_status_chan
memset(&ksettings, 0, sizeof(ksettings)); phy_ethtool_get_link_ksettings(netdev, &ksettings); - local_advertisement = phy_read(phydev, MII_ADVERTISE); - if (local_advertisement < 0) - return; - - remote_advertisement = phy_read(phydev, MII_LPA); - if (remote_advertisement < 0) - return; + local_advertisement = + ethtool_adv_to_mii_adv_t(phydev->advertising); + remote_advertisement = + ethtool_adv_to_mii_adv_t(phydev->lp_advertising);
lan743x_phy_update_flowcontrol(adapter, ksettings.base.duplex,
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang xiyou.wangcong@gmail.com
[ Upstream commit 26d92e951fe0a44ee4aec157cabb65a818cc8151 ]
In smc_release() we release smc->clcsock before unhashing the smc sock, but a parallel smc_diag_dump() may still be reading smc->clcsock; this could therefore cause a use-after-free, as reported by syzbot.
Reported-and-tested-by: syzbot+fbd1e5476e4c94c7b34e@syzkaller.appspotmail.com Fixes: 51f1de79ad8e ("net/smc: replace sock_put worker by socket refcounting") Cc: Ursula Braun ubraun@linux.ibm.com Signed-off-by: Cong Wang xiyou.wangcong@gmail.com Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/smc/af_smc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -144,6 +144,9 @@ static int smc_release(struct socket *so sock_set_flag(sk, SOCK_DEAD); sk->sk_shutdown |= SHUTDOWN_MASK; } + + sk->sk_prot->unhash(sk); + if (smc->clcsock) { if (smc->use_fallback && sk->sk_state == SMC_LISTEN) { /* wake up clcsock accept */ @@ -168,7 +171,6 @@ static int smc_release(struct socket *so smc_conn_free(&smc->conn); release_sock(sk);
- sk->sk_prot->unhash(sk); sock_put(sk); /* final sock_put */ out: return rc;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 10262b0b53666cbc506989b17a3ead1e9c3b43b4 ]
Avoid log spam caused by trying to read counters from the chip whilst it is in a PCI power-save state.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=107421
Fixes: 1ef7286e7f36 ("r8169: Dereference MMIO address immediately before use") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c +++ b/drivers/net/ethernet/realtek/r8169.c @@ -1730,11 +1730,13 @@ static bool rtl8169_reset_counters(struc
static bool rtl8169_update_counters(struct rtl8169_private *tp) { + u8 val = RTL_R8(tp, ChipCmd); + /* * Some chips are unable to dump tally counters when the receiver - * is disabled. + * is disabled. If 0xff chip may be in a PCI power-save state. */ - if ((RTL_R8(tp, ChipCmd) & CmdRxEnb) == 0) + if (!(val & CmdRxEnb) || val == 0xff) return true;
return rtl8169_do_counters(tp, CounterDump);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit 001e465f09a18857443489a57e74314a3368c805 ]
A network device stack with multiple layers of bonding devices can trigger a false positive lockdep warning. Adding lockdep nest levels fixes this. Update the level on both enslave and unlink, to avoid the following series of events ..
ip netns add test
ip netns exec test bash
ip link set dev lo addr 00:11:22:33:44:55
ip link set dev lo down

ip link add dev bond1 type bond
ip link add dev bond2 type bond

ip link set dev lo master bond1
ip link set dev bond1 master bond2

ip link set dev bond1 nomaster
ip link set dev bond2 master bond1
.. from still generating a splat:
[ 193.652127] ====================================================== [ 193.658231] WARNING: possible circular locking dependency detected [ 193.664350] 4.20.0 #8 Not tainted [ 193.668310] ------------------------------------------------------ [ 193.674417] ip/15577 is trying to acquire lock: [ 193.678897] 00000000a40e3b69 (&(&bond->stats_lock)->rlock#3/3){+.+.}, at: bond_get_stats+0x58/0x290 [ 193.687851] but task is already holding lock: [ 193.693625] 00000000807b9d9f (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0x58/0x290
[..]
[ 193.851092] lock_acquire+0xa7/0x190 [ 193.855138] _raw_spin_lock_nested+0x2d/0x40 [ 193.859878] bond_get_stats+0x58/0x290 [ 193.864093] dev_get_stats+0x5a/0xc0 [ 193.868140] bond_get_stats+0x105/0x290 [ 193.872444] dev_get_stats+0x5a/0xc0 [ 193.876493] rtnl_fill_stats+0x40/0x130 [ 193.880797] rtnl_fill_ifinfo+0x6c5/0xdc0 [ 193.885271] rtmsg_ifinfo_build_skb+0x86/0xe0 [ 193.890091] rtnetlink_event+0x5b/0xa0 [ 193.894320] raw_notifier_call_chain+0x43/0x60 [ 193.899225] netdev_change_features+0x50/0xa0 [ 193.904044] bond_compute_features.isra.46+0x1ab/0x270 [ 193.909640] bond_enslave+0x141d/0x15b0 [ 193.913946] do_set_master+0x89/0xa0 [ 193.918016] do_setlink+0x37c/0xda0 [ 193.921980] __rtnl_newlink+0x499/0x890 [ 193.926281] rtnl_newlink+0x48/0x70 [ 193.930238] rtnetlink_rcv_msg+0x171/0x4b0 [ 193.934801] netlink_rcv_skb+0xd1/0x110 [ 193.939103] rtnetlink_rcv+0x15/0x20 [ 193.943151] netlink_unicast+0x3b5/0x520 [ 193.947544] netlink_sendmsg+0x2fd/0x3f0 [ 193.951942] sock_sendmsg+0x38/0x50 [ 193.955899] ___sys_sendmsg+0x2ba/0x2d0 [ 193.960205] __x64_sys_sendmsg+0xad/0x100 [ 193.964687] do_syscall_64+0x5a/0x460 [ 193.968823] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 7e2556e40026 ("bonding: avoid lockdep confusion in bond_get_stats()") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1947,6 +1947,9 @@ static int __bond_release_one(struct net if (!bond_has_slaves(bond)) { bond_set_carrier(bond); eth_hw_addr_random(bond_dev); + bond->nest_level = SINGLE_DEPTH_NESTING; + } else { + bond->nest_level = dev_get_nest_level(bond_dev) + 1; }
unblock_netpoll_tx();
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit 4a06fa67c4da20148803525151845276cdb995c1 ]
Commit 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull") avoided a read beyond the end of the skb linear segment by calling pskb_may_pull.
That function can trigger a BUG_ON in pskb_expand_head if the skb is shared, which it is when peeking. It can also return ENOMEM.
Avoid both by switching to safer skb_header_pointer.
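For reference, the skb_header_pointer() pattern looks roughly like the helper below (illustrative only; read_ports is a made-up name, not one of the patched functions). The requested bytes are copied into a caller-supplied buffer when they are not already in the linear area, so a shared skb is never reallocated or written to:

#include <linux/skbuff.h>

/* Read the first two 16-bit fields of the transport header, i.e. the
 * source and destination ports of TCP/UDP-style protocols.
 */
static bool read_ports(const struct sk_buff *skb, __be16 ports[2])
{
        __be16 _ports[2];
        const __be16 *p;

        p = skb_header_pointer(skb, skb_transport_offset(skb),
                               sizeof(_ports), _ports);
        if (!p)
                return false;   /* packet too short */

        ports[0] = p[0];
        ports[1] = p[1];
        return true;
}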
Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull") Reported-by: syzbot syzkaller@googlegroups.com Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_sockglue.c | 12 +++++------- net/ipv6/datagram.c | 10 ++++------ 2 files changed, 9 insertions(+), 13 deletions(-)
--- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -148,19 +148,17 @@ static void ip_cmsg_recv_security(struct
static void ip_cmsg_recv_dstaddr(struct msghdr *msg, struct sk_buff *skb) { + __be16 _ports[2], *ports; struct sockaddr_in sin; - __be16 *ports; - int end; - - end = skb_transport_offset(skb) + 4; - if (end > 0 && !pskb_may_pull(skb, end)) - return;
/* All current transport protocols have the port numbers in the * first four bytes of the transport header and this function is * written with this assumption in mind. */ - ports = (__be16 *)skb_transport_header(skb); + ports = skb_header_pointer(skb, skb_transport_offset(skb), + sizeof(_ports), &_ports); + if (!ports) + return;
sin.sin_family = AF_INET; sin.sin_addr.s_addr = ip_hdr(skb)->daddr; --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -701,17 +701,15 @@ void ip6_datagram_recv_specific_ctl(stru } if (np->rxopt.bits.rxorigdstaddr) { struct sockaddr_in6 sin6; - __be16 *ports; - int end; + __be16 _ports[2], *ports;
- end = skb_transport_offset(skb) + 4; - if (end <= 0 || pskb_may_pull(skb, end)) { + ports = skb_header_pointer(skb, skb_transport_offset(skb), + sizeof(_ports), &_ports); + if (ports) { /* All current transport protocols have the port numbers in the * first four bytes of the transport header and this function is * written with this assumption in mind. */ - ports = (__be16 *)skb_transport_header(skb); - sin6.sin6_family = AF_INET6; sin6.sin6_addr = ipv6_hdr(skb)->daddr; sin6.sin6_port = ports[1];
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
[ Upstream commit 11287b693d03830010356339e4ceddf47dee34fa ]
This soft dependency works around an issue where sometimes the genphy driver is used instead of the dedicated PHY driver. The root cause of the issue isn't clear yet. People reported that unloading/re-loading the r8169 module helps, as does configuring this soft dependency in the modprobe config files. What seems to be important is that the realtek module is loaded before r8169.
Once this has been applied, the preliminary fix 38af4b903210 ("net: phy: add workaround for issue where PHY driver doesn't bind to the device") will be removed.
Fixes: f1e911d5d0df ("r8169: add basic phylib support") Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/realtek/r8169.c +++ b/drivers/net/ethernet/realtek/r8169.c @@ -717,6 +717,7 @@ module_param(use_dac, int, 0); MODULE_PARM_DESC(use_dac, "Enable PCI DAC. Unsafe on 32 bit PCI slot."); module_param_named(debug, debug.msg_enable, int, 0); MODULE_PARM_DESC(debug, "Debug verbosity level (0=none, ..., 16=all)"); +MODULE_SOFTDEP("pre: realtek"); MODULE_LICENSE("GPL"); MODULE_FIRMWARE(FIRMWARE_8168D_1); MODULE_FIRMWARE(FIRMWARE_8168D_2);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit d45a90cb5d061fa7d411b974b950fe0b8bc5f265 upstream.
sm3_compress() calls rol32() with shift >= 32, which causes undefined behavior. This is easily detected by enabling CONFIG_UBSAN.
Explicitly AND with 31 to make the behavior well defined.
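A stand-alone user-space sketch of the arithmetic (this is not the kernel's rol32(), just the same idea): once the rotate count is reduced modulo 32 at the call site, every shift the compiler sees stays in the well-defined 0..31 range.

#include <stdint.h>
#include <stdio.h>

/* Rotate left by 'shift'; the caller guarantees 0 <= shift <= 31. */
static uint32_t rol32(uint32_t word, unsigned int shift)
{
        return (word << shift) | (word >> ((32 - shift) & 31));
}

int main(void)
{
        uint32_t t = 0x79cc4519;        /* one of SM3's round constants */
        unsigned int i;

        /* The round counter runs 0..63, hence the "i & 31" in the fix. */
        for (i = 0; i < 64; i++)
                printf("round %2u: 0x%08x\n", i,
                       (unsigned int)rol32(t, i & 31));
        return 0;
}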
Fixes: 4f0fc1600edb ("crypto: sm3 - add OSCCA SM3 secure hash") Cc: stable@vger.kernel.org # v4.15+ Cc: Gilad Ben-Yossef gilad@benyossef.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/sm3_generic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/crypto/sm3_generic.c +++ b/crypto/sm3_generic.c @@ -100,7 +100,7 @@ static void sm3_compress(u32 *w, u32 *wt
for (i = 0; i <= 63; i++) {
- ss1 = rol32((rol32(a, 12) + e + rol32(t(i), i)), 7); + ss1 = rol32((rol32(a, 12) + e + rol32(t(i), i & 31)), 7);
ss2 = ss1 ^ rol32(a, 12);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aymen Sghaier aymen.sghaier@nxp.com
commit 04e6d25c5bb244c1a37eb9fe0b604cc11a04e8c5 upstream.
Recent changes - probably DMA API related (generic and/or arm64-specific) - exposed a case where the driver maps a zero-length buffer: ahash_init()->ahash_update()->ahash_final() with a zero-length string to hash
kernel BUG at kernel/dma/swiotlb.c:475! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 2 PID: 1823 Comm: cryptomgr_test Not tainted 4.20.0-rc1-00108-g00c9fe37a7f2 #1 Hardware name: LS1046A RDB Board (DT) pstate: 80000005 (Nzcv daif -PAN -UAO) pc : swiotlb_tbl_map_single+0x170/0x2b8 lr : swiotlb_map_page+0x134/0x1f8 sp : ffff00000f79b8f0 x29: ffff00000f79b8f0 x28: 0000000000000000 x27: ffff0000093d0000 x26: 0000000000000000 x25: 00000000001f3ffe x24: 0000000000200000 x23: 0000000000000000 x22: 00000009f2c538c0 x21: ffff800970aeb410 x20: 0000000000000001 x19: ffff800970aeb410 x18: 0000000000000007 x17: 000000000000000e x16: 0000000000000001 x15: 0000000000000019 x14: c32cb8218a167fe8 x13: ffffffff00000000 x12: ffff80097fdae348 x11: 0000800976bca000 x10: 0000000000000010 x9 : 0000000000000000 x8 : ffff0000091fd6c8 x7 : 0000000000000000 x6 : 00000009f2c538bf x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 00000009f2c538c0 x1 : 00000000f9fff000 x0 : 0000000000000000 Process cryptomgr_test (pid: 1823, stack limit = 0x(____ptrval____)) Call trace: swiotlb_tbl_map_single+0x170/0x2b8 swiotlb_map_page+0x134/0x1f8 ahash_final_no_ctx+0xc4/0x6cc ahash_final+0x10/0x18 crypto_ahash_op+0x30/0x84 crypto_ahash_final+0x14/0x1c __test_hash+0x574/0xe0c test_hash+0x28/0x80 __alg_test_hash+0x84/0xd0 alg_test_hash+0x78/0x144 alg_test.part.30+0x12c/0x2b4 alg_test+0x3c/0x68 cryptomgr_test+0x44/0x4c kthread+0xfc/0x128 ret_from_fork+0x10/0x18 Code: d34bfc18 2a1a03f7 1a9f8694 35fff89a (d4210000)
Cc: stable@vger.kernel.org Signed-off-by: Aymen Sghaier aymen.sghaier@nxp.com Signed-off-by: Horia Geantă horia.geanta@nxp.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/caam/caamhash.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -1131,13 +1131,16 @@ static int ahash_final_no_ctx(struct aha
desc = edesc->hw_desc;
- state->buf_dma = dma_map_single(jrdev, buf, buflen, DMA_TO_DEVICE); - if (dma_mapping_error(jrdev, state->buf_dma)) { - dev_err(jrdev, "unable to map src\n"); - goto unmap; - } + if (buflen) { + state->buf_dma = dma_map_single(jrdev, buf, buflen, + DMA_TO_DEVICE); + if (dma_mapping_error(jrdev, state->buf_dma)) { + dev_err(jrdev, "unable to map src\n"); + goto unmap; + }
- append_seq_in_ptr(desc, state->buf_dma, buflen, 0); + append_seq_in_ptr(desc, state->buf_dma, buflen, 0); + }
edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, digestsize);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harsh Jain harsh@chelsio.com
commit a7773363624b034ab198c738661253d20a8055c2 upstream.
The authencesn template in the decrypt path unconditionally calls aead_request_complete() after ahash verification, which leads to the following kernel panic after decryption.
[ 338.539800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [ 338.548372] PGD 0 P4D 0 [ 338.551157] Oops: 0000 [#1] SMP PTI [ 338.554919] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G W I 4.19.7+ #13 [ 338.564431] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 338.572212] RIP: 0010:esp_input_done2+0x350/0x410 [esp4] [ 338.578030] Code: ff 0f b6 68 10 48 8b 83 c8 00 00 00 e9 8e fe ff ff 8b 04 25 04 00 00 00 83 e8 01 48 98 48 8b 3c c5 10 00 00 00 e9 f7 fd ff ff <8b> 04 25 04 00 00 00 83 e8 01 48 98 4c 8b 24 c5 10 00 00 00 e9 3b [ 338.598547] RSP: 0018:ffff911c97803c00 EFLAGS: 00010246 [ 338.604268] RAX: 0000000000000002 RBX: ffff911c4469ee00 RCX: 0000000000000000 [ 338.612090] RDX: 0000000000000000 RSI: 0000000000000130 RDI: ffff911b87c20400 [ 338.619874] RBP: 0000000000000000 R08: ffff911b87c20498 R09: 000000000000000a [ 338.627610] R10: 0000000000000001 R11: 0000000000000004 R12: 0000000000000000 [ 338.635402] R13: ffff911c89590000 R14: ffff911c91730000 R15: 0000000000000000 [ 338.643234] FS: 0000000000000000(0000) GS:ffff911c97800000(0000) knlGS:0000000000000000 [ 338.652047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 338.658299] CR2: 0000000000000004 CR3: 00000001ec20a000 CR4: 00000000000006f0 [ 338.666382] Call Trace: [ 338.669051] <IRQ> [ 338.671254] esp_input_done+0x12/0x20 [esp4] [ 338.675922] chcr_handle_resp+0x3b5/0x790 [chcr] [ 338.680949] cpl_fw6_pld_handler+0x37/0x60 [chcr] [ 338.686080] chcr_uld_rx_handler+0x22/0x50 [chcr] [ 338.691233] uldrx_handler+0x8c/0xc0 [cxgb4] [ 338.695923] process_responses+0x2f0/0x5d0 [cxgb4] [ 338.701177] ? bitmap_find_next_zero_area_off+0x3a/0x90 [ 338.706882] ? matrix_alloc_area.constprop.7+0x60/0x90 [ 338.712517] ? apic_update_irq_cfg+0x82/0xf0 [ 338.717177] napi_rx_handler+0x14/0xe0 [cxgb4] [ 338.722015] net_rx_action+0x2aa/0x3e0 [ 338.726136] __do_softirq+0xcb/0x280 [ 338.730054] irq_exit+0xde/0xf0 [ 338.733504] do_IRQ+0x54/0xd0 [ 338.736745] common_interrupt+0xf/0xf
Fixes: 104880a6b470 ("crypto: authencesn - Convert to new AEAD...") Signed-off-by: Harsh Jain harsh@chelsio.com Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/authencesn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/crypto/authencesn.c +++ b/crypto/authencesn.c @@ -279,7 +279,7 @@ static void authenc_esn_verify_ahash_don struct aead_request *req = areq->data;
err = err ?: crypto_authenc_esn_decrypt_tail(req, 0); - aead_request_complete(req, err); + authenc_esn_request_complete(req, err); }
static int crypto_authenc_esn_decrypt(struct aead_request *req)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit dc95b5350a8f07d73d6bde3a79ef87289698451d upstream.
Convert the ccree crypto driver to use crypto_authenc_extractkeys() so that it picks up the fix for broken validation of rtattr::rta_len.
Fixes: ff27e85a85bb ("crypto: ccree - add AEAD support") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/ccree/cc_aead.c | 40 +++++++++++++++++++--------------------- 1 file changed, 19 insertions(+), 21 deletions(-)
--- a/drivers/crypto/ccree/cc_aead.c +++ b/drivers/crypto/ccree/cc_aead.c @@ -540,13 +540,12 @@ static int cc_aead_setkey(struct crypto_ unsigned int keylen) { struct cc_aead_ctx *ctx = crypto_aead_ctx(tfm); - struct rtattr *rta = (struct rtattr *)key; struct cc_crypto_req cc_req = {}; - struct crypto_authenc_key_param *param; struct cc_hw_desc desc[MAX_AEAD_SETKEY_SEQ]; - int rc = -EINVAL; unsigned int seq_len = 0; struct device *dev = drvdata_to_dev(ctx->drvdata); + const u8 *enckey, *authkey; + int rc;
dev_dbg(dev, "Setting key in context @%p for %s. key=%p keylen=%u\n", ctx, crypto_tfm_alg_name(crypto_aead_tfm(tfm)), key, keylen); @@ -554,35 +553,33 @@ static int cc_aead_setkey(struct crypto_ /* STAT_PHASE_0: Init and sanity checks */
if (ctx->auth_mode != DRV_HASH_NULL) { /* authenc() alg. */ - if (!RTA_OK(rta, keylen)) - goto badkey; - if (rta->rta_type != CRYPTO_AUTHENC_KEYA_PARAM) - goto badkey; - if (RTA_PAYLOAD(rta) < sizeof(*param)) - goto badkey; - param = RTA_DATA(rta); - ctx->enc_keylen = be32_to_cpu(param->enckeylen); - key += RTA_ALIGN(rta->rta_len); - keylen -= RTA_ALIGN(rta->rta_len); - if (keylen < ctx->enc_keylen) + struct crypto_authenc_keys keys; + + rc = crypto_authenc_extractkeys(&keys, key, keylen); + if (rc) goto badkey; - ctx->auth_keylen = keylen - ctx->enc_keylen; + enckey = keys.enckey; + authkey = keys.authkey; + ctx->enc_keylen = keys.enckeylen; + ctx->auth_keylen = keys.authkeylen;
if (ctx->cipher_mode == DRV_CIPHER_CTR) { /* the nonce is stored in bytes at end of key */ + rc = -EINVAL; if (ctx->enc_keylen < (AES_MIN_KEY_SIZE + CTR_RFC3686_NONCE_SIZE)) goto badkey; /* Copy nonce from last 4 bytes in CTR key to * first 4 bytes in CTR IV */ - memcpy(ctx->ctr_nonce, key + ctx->auth_keylen + - ctx->enc_keylen - CTR_RFC3686_NONCE_SIZE, - CTR_RFC3686_NONCE_SIZE); + memcpy(ctx->ctr_nonce, enckey + ctx->enc_keylen - + CTR_RFC3686_NONCE_SIZE, CTR_RFC3686_NONCE_SIZE); /* Set CTR key size */ ctx->enc_keylen -= CTR_RFC3686_NONCE_SIZE; } } else { /* non-authenc - has just one key */ + enckey = key; + authkey = NULL; ctx->enc_keylen = keylen; ctx->auth_keylen = 0; } @@ -594,13 +591,14 @@ static int cc_aead_setkey(struct crypto_ /* STAT_PHASE_1: Copy key to ctx */
/* Get key material */ - memcpy(ctx->enckey, key + ctx->auth_keylen, ctx->enc_keylen); + memcpy(ctx->enckey, enckey, ctx->enc_keylen); if (ctx->enc_keylen == 24) memset(ctx->enckey + 24, 0, CC_AES_KEY_SIZE_MAX - 24); if (ctx->auth_mode == DRV_HASH_XCBC_MAC) { - memcpy(ctx->auth_state.xcbc.xcbc_keys, key, ctx->auth_keylen); + memcpy(ctx->auth_state.xcbc.xcbc_keys, authkey, + ctx->auth_keylen); } else if (ctx->auth_mode != DRV_HASH_NULL) { /* HMAC */ - rc = cc_get_plain_hmac_key(tfm, key, ctx->auth_keylen); + rc = cc_get_plain_hmac_key(tfm, authkey, ctx->auth_keylen); if (rc) goto badkey; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit ab57b33525c3221afaebd391458fa0cbcd56903d upstream.
Convert the bcm crypto driver to use crypto_authenc_extractkeys() so that it picks up the fix for broken validation of rtattr::rta_len.
This also fixes the DES weak key check to actually be done on the right key. (It was checking the authentication key, not the encryption key...)
Fixes: 9d12ba86f818 ("crypto: brcm - Add Broadcom SPU driver") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/Kconfig | 1 + drivers/crypto/bcm/cipher.c | 44 +++++++++++++------------------------------- 2 files changed, 14 insertions(+), 31 deletions(-)
--- a/drivers/crypto/Kconfig +++ b/drivers/crypto/Kconfig @@ -681,6 +681,7 @@ config CRYPTO_DEV_BCM_SPU depends on ARCH_BCM_IPROC depends on MAILBOX default m + select CRYPTO_AUTHENC select CRYPTO_DES select CRYPTO_MD5 select CRYPTO_SHA1 --- a/drivers/crypto/bcm/cipher.c +++ b/drivers/crypto/bcm/cipher.c @@ -2845,44 +2845,28 @@ static int aead_authenc_setkey(struct cr struct spu_hw *spu = &iproc_priv.spu; struct iproc_ctx_s *ctx = crypto_aead_ctx(cipher); struct crypto_tfm *tfm = crypto_aead_tfm(cipher); - struct rtattr *rta = (void *)key; - struct crypto_authenc_key_param *param; - const u8 *origkey = key; - const unsigned int origkeylen = keylen; - - int ret = 0; + struct crypto_authenc_keys keys; + int ret;
flow_log("%s() aead:%p key:%p keylen:%u\n", __func__, cipher, key, keylen); flow_dump(" key: ", key, keylen);
- if (!RTA_OK(rta, keylen)) - goto badkey; - if (rta->rta_type != CRYPTO_AUTHENC_KEYA_PARAM) + ret = crypto_authenc_extractkeys(&keys, key, keylen); + if (ret) goto badkey; - if (RTA_PAYLOAD(rta) < sizeof(*param)) - goto badkey; - - param = RTA_DATA(rta); - ctx->enckeylen = be32_to_cpu(param->enckeylen);
- key += RTA_ALIGN(rta->rta_len); - keylen -= RTA_ALIGN(rta->rta_len); - - if (keylen < ctx->enckeylen) - goto badkey; - if (ctx->enckeylen > MAX_KEY_SIZE) + if (keys.enckeylen > MAX_KEY_SIZE || + keys.authkeylen > MAX_KEY_SIZE) goto badkey;
- ctx->authkeylen = keylen - ctx->enckeylen; - - if (ctx->authkeylen > MAX_KEY_SIZE) - goto badkey; + ctx->enckeylen = keys.enckeylen; + ctx->authkeylen = keys.authkeylen;
- memcpy(ctx->enckey, key + ctx->authkeylen, ctx->enckeylen); + memcpy(ctx->enckey, keys.enckey, keys.enckeylen); /* May end up padding auth key. So make sure it's zeroed. */ memset(ctx->authkey, 0, sizeof(ctx->authkey)); - memcpy(ctx->authkey, key, ctx->authkeylen); + memcpy(ctx->authkey, keys.authkey, keys.authkeylen);
switch (ctx->alg->cipher_info.alg) { case CIPHER_ALG_DES: @@ -2890,7 +2874,7 @@ static int aead_authenc_setkey(struct cr u32 tmp[DES_EXPKEY_WORDS]; u32 flags = CRYPTO_TFM_RES_WEAK_KEY;
- if (des_ekey(tmp, key) == 0) { + if (des_ekey(tmp, keys.enckey) == 0) { if (crypto_aead_get_flags(cipher) & CRYPTO_TFM_REQ_WEAK_KEY) { crypto_aead_set_flags(cipher, flags); @@ -2905,7 +2889,7 @@ static int aead_authenc_setkey(struct cr break; case CIPHER_ALG_3DES: if (ctx->enckeylen == (DES_KEY_SIZE * 3)) { - const u32 *K = (const u32 *)key; + const u32 *K = (const u32 *)keys.enckey; u32 flags = CRYPTO_TFM_RES_BAD_KEY_SCHED;
if (!((K[0] ^ K[2]) | (K[1] ^ K[3])) || @@ -2956,9 +2940,7 @@ static int aead_authenc_setkey(struct cr ctx->fallback_cipher->base.crt_flags &= ~CRYPTO_TFM_REQ_MASK; ctx->fallback_cipher->base.crt_flags |= tfm->crt_flags & CRYPTO_TFM_REQ_MASK; - ret = - crypto_aead_setkey(ctx->fallback_cipher, origkey, - origkeylen); + ret = crypto_aead_setkey(ctx->fallback_cipher, key, keylen); if (ret) { flow_log(" fallback setkey() returned:%d\n", ret); tfm->crt_flags &= ~CRYPTO_TFM_RES_MASK;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 8f9c469348487844328e162db57112f7d347c49f upstream.
Keys for "authenc" AEADs are formatted as an rtattr containing a 4-byte 'enckeylen', followed by an authentication key and an encryption key. crypto_authenc_extractkeys() parses the key to find the inner keys.
However, it fails to consider the case where the rtattr's payload is longer than 4 bytes but not 4-byte aligned, and where the key ends before the next 4-byte aligned boundary. In this case, 'keylen -= RTA_ALIGN(rta->rta_len);' underflows to a value near UINT_MAX. This causes a buffer overread and crash during crypto_ahash_setkey().
Fix it by restricting the rtattr payload to the expected size.
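A stand-alone sketch of the failing arithmetic, using the same 9-byte key layout as the reproducer below (the RTA_ALIGN macro is copied here only to keep the example self-contained):

#include <stdio.h>

#define RTA_ALIGNTO     4u
#define RTA_ALIGN(len)  (((len) + RTA_ALIGNTO - 1) & ~(RTA_ALIGNTO - 1))

int main(void)
{
        /* 4-byte rtattr header + 4-byte enckeylen + 1 key byte = 9 bytes,
         * so rta_len == keylen == 9, but RTA_ALIGN(9) == 12.
         */
        unsigned int keylen = 9;
        unsigned int rta_len = 9;

        keylen -= RTA_ALIGN(rta_len);   /* 9 - 12 wraps around */
        printf("keylen after subtraction: %u\n", keylen);
        return 0;
}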
Reproducer using AF_ALG:
#include <linux/if_alg.h>
#include <linux/rtnetlink.h>
#include <sys/socket.h>

int main()
{
        int fd;
        struct sockaddr_alg addr = {
                .salg_type = "aead",
                .salg_name = "authenc(hmac(sha256),cbc(aes))",
        };
        struct {
                struct rtattr attr;
                __be32 enckeylen;
                char keys[1];
        } __attribute__((packed)) key = {
                .attr.rta_len = sizeof(key),
                .attr.rta_type = 1 /* CRYPTO_AUTHENC_KEYA_PARAM */,
        };

        fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
        bind(fd, (void *)&addr, sizeof(addr));
        setsockopt(fd, SOL_ALG, ALG_SET_KEY, &key, sizeof(key));
}
It caused:
BUG: unable to handle kernel paging request at ffff88007ffdc000 PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 2e05067 PTE 0 Oops: 0000 [#1] SMP CPU: 0 PID: 883 Comm: authenc Not tainted 4.20.0-rc1-00108-g00c9fe37a7f27 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014 RIP: 0010:sha256_ni_transform+0xb3/0x330 arch/x86/crypto/sha256_ni_asm.S:155 [...] Call Trace: sha256_ni_finup+0x10/0x20 arch/x86/crypto/sha256_ssse3_glue.c:321 crypto_shash_finup+0x1a/0x30 crypto/shash.c:178 shash_digest_unaligned+0x45/0x60 crypto/shash.c:186 crypto_shash_digest+0x24/0x40 crypto/shash.c:202 hmac_setkey+0x135/0x1e0 crypto/hmac.c:66 crypto_shash_setkey+0x2b/0xb0 crypto/shash.c:66 shash_async_setkey+0x10/0x20 crypto/shash.c:223 crypto_ahash_setkey+0x2d/0xa0 crypto/ahash.c:202 crypto_authenc_setkey+0x68/0x100 crypto/authenc.c:96 crypto_aead_setkey+0x2a/0xc0 crypto/aead.c:62 aead_setkey+0xc/0x10 crypto/algif_aead.c:526 alg_setkey crypto/af_alg.c:223 [inline] alg_setsockopt+0xfe/0x130 crypto/af_alg.c:256 __sys_setsockopt+0x6d/0xd0 net/socket.c:1902 __do_sys_setsockopt net/socket.c:1913 [inline] __se_sys_setsockopt net/socket.c:1910 [inline] __x64_sys_setsockopt+0x1f/0x30 net/socket.c:1910 do_syscall_64+0x4a/0x180 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: e236d4a89a2f ("[CRYPTO] authenc: Move enckeylen into key itself") Cc: stable@vger.kernel.org # v2.6.25+ Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/authenc.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/crypto/authenc.c +++ b/crypto/authenc.c @@ -58,14 +58,22 @@ int crypto_authenc_extractkeys(struct cr return -EINVAL; if (rta->rta_type != CRYPTO_AUTHENC_KEYA_PARAM) return -EINVAL; - if (RTA_PAYLOAD(rta) < sizeof(*param)) + + /* + * RTA_OK() didn't align the rtattr's payload when validating that it + * fits in the buffer. Yet, the keys should start on the next 4-byte + * aligned boundary. To avoid confusion, require that the rtattr + * payload be exactly the param struct, which has a 4-byte aligned size. + */ + if (RTA_PAYLOAD(rta) != sizeof(*param)) return -EINVAL; + BUILD_BUG_ON(sizeof(*param) % RTA_ALIGNTO);
param = RTA_DATA(rta); keys->enckeylen = be32_to_cpu(param->enckeylen);
- key += RTA_ALIGN(rta->rta_len); - keylen -= RTA_ALIGN(rta->rta_len); + key += rta->rta_len; + keylen -= rta->rta_len;
if (keylen < keys->enckeylen) return -EINVAL;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit c56c2e173773097a248fd3bace91ac8f6fc5386d upstream.
This patch moves the mapping of the IV to after the kmalloc(). This avoids having to unmap it in case kmalloc() fails.
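Roughly, the reordering amounts to the shape below (illustrative only; alloc_desc_then_map_iv and the fixed allocation size are made up): the step that can fail comes first, so the early error returns need no unwinding.

#include <linux/dma-mapping.h>
#include <linux/err.h>
#include <linux/slab.h>

static void *alloc_desc_then_map_iv(struct device *dev, void *iv,
                                    unsigned int ivsize, dma_addr_t *iv_dma)
{
        void *desc;

        /* Allocation first: if it fails there is nothing to undo. */
        desc = kmalloc(256, GFP_DMA | GFP_KERNEL);
        if (!desc)
                return ERR_PTR(-ENOMEM);

        /* Only now take the mapping that would otherwise have to be
         * unmapped on every earlier error path.
         */
        if (ivsize)
                *iv_dma = dma_map_single(dev, iv, ivsize, DMA_TO_DEVICE);

        return desc;
}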
Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Reviewed-by: Horia Geantă horia.geanta@nxp.com Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/talitos.c | 25 +++++++------------------ 1 file changed, 7 insertions(+), 18 deletions(-)
--- a/drivers/crypto/talitos.c +++ b/drivers/crypto/talitos.c @@ -1361,23 +1361,18 @@ static struct talitos_edesc *talitos_ede struct talitos_private *priv = dev_get_drvdata(dev); bool is_sec1 = has_ftr_sec1(priv); int max_len = is_sec1 ? TALITOS1_MAX_DATA_LEN : TALITOS2_MAX_DATA_LEN; - void *err;
if (cryptlen + authsize > max_len) { dev_err(dev, "length exceeds h/w max limit\n"); return ERR_PTR(-EINVAL); }
- if (ivsize) - iv_dma = dma_map_single(dev, iv, ivsize, DMA_TO_DEVICE); - if (!dst || dst == src) { src_len = assoclen + cryptlen + authsize; src_nents = sg_nents_for_len(src, src_len); if (src_nents < 0) { dev_err(dev, "Invalid number of src SG.\n"); - err = ERR_PTR(-EINVAL); - goto error_sg; + return ERR_PTR(-EINVAL); } src_nents = (src_nents == 1) ? 0 : src_nents; dst_nents = dst ? src_nents : 0; @@ -1387,16 +1382,14 @@ static struct talitos_edesc *talitos_ede src_nents = sg_nents_for_len(src, src_len); if (src_nents < 0) { dev_err(dev, "Invalid number of src SG.\n"); - err = ERR_PTR(-EINVAL); - goto error_sg; + return ERR_PTR(-EINVAL); } src_nents = (src_nents == 1) ? 0 : src_nents; dst_len = assoclen + cryptlen + (encrypt ? authsize : 0); dst_nents = sg_nents_for_len(dst, dst_len); if (dst_nents < 0) { dev_err(dev, "Invalid number of dst SG.\n"); - err = ERR_PTR(-EINVAL); - goto error_sg; + return ERR_PTR(-EINVAL); } dst_nents = (dst_nents == 1) ? 0 : dst_nents; } @@ -1425,10 +1418,10 @@ static struct talitos_edesc *talitos_ede alloc_len += sizeof(struct talitos_desc);
edesc = kmalloc(alloc_len, GFP_DMA | flags); - if (!edesc) { - err = ERR_PTR(-ENOMEM); - goto error_sg; - } + if (!edesc) + return ERR_PTR(-ENOMEM); + if (ivsize) + iv_dma = dma_map_single(dev, iv, ivsize, DMA_TO_DEVICE); memset(&edesc->desc, 0, sizeof(edesc->desc));
edesc->src_nents = src_nents; @@ -1445,10 +1438,6 @@ static struct talitos_edesc *talitos_ede DMA_BIDIRECTIONAL); } return edesc; -error_sg: - if (iv_dma) - dma_unmap_single(dev, iv_dma, ivsize, DMA_TO_DEVICE); - return err; }
static struct talitos_edesc *aead_edesc_alloc(struct aead_request *areq, u8 *iv,
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit 1bea445b0a022ee126ca328b3705cd4df18ebc14 upstream.
[ 2.364486] WARNING: CPU: 0 PID: 60 at ./arch/powerpc/include/asm/io.h:837 dma_nommu_map_page+0x44/0xd4 [ 2.373579] CPU: 0 PID: 60 Comm: cryptomgr_test Tainted: G W 4.20.0-rc5-00560-g6bfb52e23a00-dirty #531 [ 2.384740] NIP: c000c540 LR: c000c584 CTR: 00000000 [ 2.389743] REGS: c95abab0 TRAP: 0700 Tainted: G W (4.20.0-rc5-00560-g6bfb52e23a00-dirty) [ 2.400042] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24042204 XER: 00000000 [ 2.406669] [ 2.406669] GPR00: c02f2244 c95abb60 c6262990 c95abd80 0000256a 00000001 00000001 00000001 [ 2.406669] GPR08: 00000000 00002000 00000010 00000010 24042202 00000000 00000100 c95abd88 [ 2.406669] GPR16: 00000000 c05569d4 00000001 00000010 c95abc88 c0615664 00000004 00000000 [ 2.406669] GPR24: 00000010 c95abc88 c95abc88 00000000 c61ae210 c7ff6d40 c61ae210 00003d68 [ 2.441559] NIP [c000c540] dma_nommu_map_page+0x44/0xd4 [ 2.446720] LR [c000c584] dma_nommu_map_page+0x88/0xd4 [ 2.451762] Call Trace: [ 2.454195] [c95abb60] [82000808] 0x82000808 (unreliable) [ 2.459572] [c95abb80] [c02f2244] talitos_edesc_alloc+0xbc/0x3c8 [ 2.465493] [c95abbb0] [c02f2600] ablkcipher_edesc_alloc+0x4c/0x5c [ 2.471606] [c95abbd0] [c02f4ed0] ablkcipher_encrypt+0x20/0x64 [ 2.477389] [c95abbe0] [c02023b0] __test_skcipher+0x4bc/0xa08 [ 2.483049] [c95abe00] [c0204b60] test_skcipher+0x2c/0xcc [ 2.488385] [c95abe20] [c0204c48] alg_test_skcipher+0x48/0xbc [ 2.494064] [c95abe40] [c0205cec] alg_test+0x164/0x2e8 [ 2.499142] [c95abf00] [c0200dec] cryptomgr_test+0x48/0x50 [ 2.504558] [c95abf10] [c0039ff4] kthread+0xe4/0x110 [ 2.509471] [c95abf40] [c000e1d0] ret_from_kernel_thread+0x14/0x1c [ 2.515532] Instruction dump: [ 2.518468] 7c7e1b78 7c9d2378 7cbf2b78 41820054 3d20c076 8089c200 3d20c076 7c84e850 [ 2.526127] 8129c204 7c842e70 7f844840 419c0008 <0fe00000> 2f9e0000 54847022 7c84fa14 [ 2.533960] ---[ end trace bf78d94af73fe3b8 ]--- [ 2.539123] talitos ff020000.crypto: master data transfer error [ 2.544775] talitos ff020000.crypto: TEA error: ISR 0x20000000_00000040 [ 2.551625] alg: skcipher: encryption failed on test 1 for ecb-aes-talitos: ret=22
The IV cannot be on the stack when CONFIG_VMAP_STACK is selected because the stack cannot be DMA mapped anymore.
This patch copies the IV into the extended descriptor.
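A rough sketch of the approach (map_iv_copy is a made-up helper, not the driver's code): the possibly-on-stack IV is copied into the tail of the kmalloc()ed descriptor, and that copy, which lives in DMA-able memory, is what gets mapped.

#include <linux/dma-mapping.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/types.h>

static dma_addr_t map_iv_copy(struct device *dev, void *edesc,
                              size_t alloc_len, const u8 *iv,
                              unsigned int ivsize)
{
        /* 'iv' may point at the caller's stack, which cannot be DMA mapped
         * under CONFIG_VMAP_STACK; the descriptor was kmalloc()ed with
         * ivsize spare bytes at its end to hold the copy.
         */
        u8 *iv_copy = (u8 *)edesc + alloc_len - ivsize;

        memcpy(iv_copy, iv, ivsize);
        return dma_map_single(dev, iv_copy, ivsize, DMA_TO_DEVICE);
}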
Fixes: 4de9d0b547b9 ("crypto: talitos - Add ablkcipher algorithms") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Reviewed-by: Horia Geantă horia.geanta@nxp.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/talitos.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/crypto/talitos.c +++ b/drivers/crypto/talitos.c @@ -1416,12 +1416,15 @@ static struct talitos_edesc *talitos_ede /* if its a ahash, add space for a second desc next to the first one */ if (is_sec1 && !dst) alloc_len += sizeof(struct talitos_desc); + alloc_len += ivsize;
edesc = kmalloc(alloc_len, GFP_DMA | flags); if (!edesc) return ERR_PTR(-ENOMEM); - if (ivsize) + if (ivsize) { + iv = memcpy(((u8 *)edesc) + alloc_len - ivsize, iv, ivsize); iv_dma = dma_map_single(dev, iv, ivsize, DMA_TO_DEVICE); + } memset(&edesc->desc, 0, sizeof(edesc->desc));
edesc->src_nents = src_nents;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juergen Gross jgross@suse.com
commit 867cefb4cb1012f42cada1c7d1f35ac8dd276071 upstream.
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value.
[boris: replaced pr_info() with pr_info_once() in xen_callback_vector() to avoid printing with incorrect timestamp during resume (as we haven't re-adjusted the clock yet)]
Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") Cc: stable@vger.kernel.org # 4.11 Reported-by: Hans van Kranenburg hans.van.kranenburg@mendix.com Signed-off-by: Juergen Gross jgross@suse.com Tested-by: Hans van Kranenburg hans.van.kranenburg@mendix.com Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/xen/time.c | 12 +++++++++--- drivers/xen/events/events_base.c | 2 +- 2 files changed, 10 insertions(+), 4 deletions(-)
--- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -361,8 +361,6 @@ void xen_timer_resume(void) { int cpu;
- pvclock_resume(); - if (xen_clockevent != &xen_vcpuop_clockevent) return;
@@ -379,12 +377,15 @@ static const struct pv_time_ops xen_time };
static struct pvclock_vsyscall_time_info *xen_clock __read_mostly; +static u64 xen_clock_value_saved;
void xen_save_time_memory_area(void) { struct vcpu_register_time_memory_area t; int ret;
+ xen_clock_value_saved = xen_clocksource_read() - xen_sched_clock_offset; + if (!xen_clock) return;
@@ -404,7 +405,7 @@ void xen_restore_time_memory_area(void) int ret;
if (!xen_clock) - return; + goto out;
t.addr.v = &xen_clock->pvti;
@@ -421,6 +422,11 @@ void xen_restore_time_memory_area(void) if (ret != 0) pr_notice("Cannot restore secondary vcpu_time_info (err %d)", ret); + +out: + /* Need pvclock_resume() before using xen_clocksource_read(). */ + pvclock_resume(); + xen_sched_clock_offset = xen_clocksource_read() - xen_clock_value_saved; }
static void xen_setup_vsyscall_time_info(void) --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1650,7 +1650,7 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; } - pr_info("Xen HVM callback vector for event delivery is enabled\n"); + pr_info_once("Xen HVM callback vector for event delivery is enabled\n"); alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, xen_hvm_callback_vector); }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Sterba dsterba@suse.com
commit 77b7aad195099e7c6da11e94b7fa6ef5e6fb0025 upstream.
This reverts commit e73e81b6d0114d4a303205a952ab2e87c44bd279.
This patch causes a few problems:
- adds latency to btrfs_finish_ordered_io
- as btrfs_finish_ordered_io is used for the free space cache, generating more work from btrfs_btree_balance_dirty_nodelay could end up in the same workqueue, effectively deadlocking
12260 kworker/u96:16+btrfs-freespace-write D
[<0>] balance_dirty_pages+0x6e6/0x7ad
[<0>] balance_dirty_pages_ratelimited+0x6bb/0xa90
[<0>] btrfs_finish_ordered_io+0x3da/0x770
[<0>] normal_work_helper+0x1c5/0x5a0
[<0>] process_one_work+0x1ee/0x5a0
[<0>] worker_thread+0x46/0x3d0
[<0>] kthread+0xf5/0x130
[<0>] ret_from_fork+0x24/0x30
[<0>] 0xffffffffffffffff
Transaction commit will wait on the freespace cache:
838 btrfs-transacti D
[<0>] btrfs_start_ordered_extent+0x154/0x1e0
[<0>] btrfs_wait_ordered_range+0xbd/0x110
[<0>] __btrfs_wait_cache_io+0x49/0x1a0
[<0>] btrfs_write_dirty_block_groups+0x10b/0x3b0
[<0>] commit_cowonly_roots+0x215/0x2b0
[<0>] btrfs_commit_transaction+0x37e/0x910
[<0>] transaction_kthread+0x14d/0x180
[<0>] kthread+0xf5/0x130
[<0>] ret_from_fork+0x24/0x30
[<0>] 0xffffffffffffffff
And then writepages ends up waiting on transaction commit:
9520 kworker/u96:13+flush-btrfs-1 D
[<0>] wait_current_trans+0xac/0xe0
[<0>] start_transaction+0x21b/0x4b0
[<0>] cow_file_range_inline+0x10b/0x6b0
[<0>] cow_file_range.isra.69+0x329/0x4a0
[<0>] run_delalloc_range+0x105/0x3c0
[<0>] writepage_delalloc+0x119/0x180
[<0>] __extent_writepage+0x10c/0x390
[<0>] extent_write_cache_pages+0x26f/0x3d0
[<0>] extent_writepages+0x4f/0x80
[<0>] do_writepages+0x17/0x60
[<0>] __writeback_single_inode+0x59/0x690
[<0>] writeback_sb_inodes+0x291/0x4e0
[<0>] __writeback_inodes_wb+0x87/0xb0
[<0>] wb_writeback+0x3bb/0x500
[<0>] wb_workfn+0x40d/0x610
[<0>] process_one_work+0x1ee/0x5a0
[<0>] worker_thread+0x1e0/0x3d0
[<0>] kthread+0xf5/0x130
[<0>] ret_from_fork+0x24/0x30
[<0>] 0xffffffffffffffff
Eventually, we have every process in the system waiting on balance_dirty_pages(), and nobody is able to make progress on page writeback.
The original patch tried to fix an OOM condition that happened on 4.4, but there was no success reproducing it on later kernels (4.19 and 4.20). It is more likely a problem in the OOM handling itself.
Link: https://lore.kernel.org/linux-btrfs/20180528054821.9092-1-ethanlien@synology... Reported-by: Chris Mason clm@fb.com CC: stable@vger.kernel.org # 4.18+ CC: ethanlien ethanlien@synology.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/inode.c | 3 --- 1 file changed, 3 deletions(-)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3151,9 +3151,6 @@ out: /* once for the tree */ btrfs_put_ordered_extent(ordered_extent);
- /* Try to release some metadata so we don't get an OOM but don't wait */ - btrfs_btree_balance_dirty_nodelay(fs_info); - return ret; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit 74d5d229b1bf60f93bff244b2dfc0eb21ec32a07 upstream.
If we flip read-only before we initiate writeback on all dirty pages for ordered extents we've created, then we'll have ordered extents left over on umount, which results in all sorts of bad things happening. Fix this by making sure we wait on ordered extents if we have to do the aborted transaction cleanup stuff.
generic/475 can produce this warning:
[ 8531.177332] WARNING: CPU: 2 PID: 11997 at fs/btrfs/disk-io.c:3856 btrfs_free_fs_root+0x95/0xa0 [btrfs]
[ 8531.183282] CPU: 2 PID: 11997 Comm: umount Tainted: G W 5.0.0-rc1-default+ #394
[ 8531.185164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
[ 8531.187851] RIP: 0010:btrfs_free_fs_root+0x95/0xa0 [btrfs]
[ 8531.193082] RSP: 0018:ffffb1ab86163d98 EFLAGS: 00010286
[ 8531.194198] RAX: ffff9f3449494d18 RBX: ffff9f34a2695000 RCX:0000000000000000
[ 8531.195629] RDX: 0000000000000002 RSI: 0000000000000001 RDI:0000000000000000
[ 8531.197315] RBP: ffff9f344e930000 R08: 0000000000000001 R09:0000000000000000
[ 8531.199095] R10: 0000000000000000 R11: ffff9f34494d4ff8 R12:ffffb1ab86163dc0
[ 8531.200870] R13: ffff9f344e9300b0 R14: ffffb1ab86163db8 R15:0000000000000000
[ 8531.202707] FS: 00007fc68e949fc0(0000) GS:ffff9f34bd800000(0000)knlGS:0000000000000000
[ 8531.204851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8531.205942] CR2: 00007ffde8114dd8 CR3: 000000002dfbd000 CR4:00000000000006e0
[ 8531.207516] Call Trace:
[ 8531.208175] btrfs_free_fs_roots+0xdb/0x170 [btrfs]
[ 8531.210209] ? wait_for_completion+0x5b/0x190
[ 8531.211303] close_ctree+0x157/0x350 [btrfs]
[ 8531.212412] generic_shutdown_super+0x64/0x100
[ 8531.213485] kill_anon_super+0x14/0x30
[ 8531.214430] btrfs_kill_super+0x12/0xa0 [btrfs]
[ 8531.215539] deactivate_locked_super+0x29/0x60
[ 8531.216633] cleanup_mnt+0x3b/0x70
[ 8531.217497] task_work_run+0x98/0xc0
[ 8531.218397] exit_to_usermode_loop+0x83/0x90
[ 8531.219324] do_syscall_64+0x15b/0x180
[ 8531.220192] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 8531.221286] RIP: 0033:0x7fc68e5e4d07
[ 8531.225621] RSP: 002b:00007ffde8116608 EFLAGS: 00000246 ORIG_RAX:00000000000000a6
[ 8531.227512] RAX: 0000000000000000 RBX: 00005580c2175970 RCX:00007fc68e5e4d07
[ 8531.229098] RDX: 0000000000000001 RSI: 0000000000000000 RDI:00005580c2175b80
[ 8531.230730] RBP: 0000000000000000 R08: 00005580c2175ba0 R09:00007ffde8114e80
[ 8531.232269] R10: 0000000000000000 R11: 0000000000000246 R12:00005580c2175b80
[ 8531.233839] R13: 00007fc68eac61c4 R14: 00005580c2175a68 R15:0000000000000000
Leaving a tree in the rb-tree:
3853 void btrfs_free_fs_root(struct btrfs_root *root)
3854 {
3855 	iput(root->ino_cache_inode);
3856 	WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
CC: stable@vger.kernel.org Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Josef Bacik josef@toxicpanda.com [ add stacktrace ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/disk-io.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4155,6 +4155,14 @@ static void btrfs_destroy_all_ordered_ex spin_lock(&fs_info->ordered_root_lock); } spin_unlock(&fs_info->ordered_root_lock); + + /* + * We need this here because if we've been flipped read-only we won't + * get sync() from the umount, so we need to make sure any ordered + * extents that haven't had their dirty pages IO start writeout yet + * actually get run and error out properly. + */ + btrfs_wait_ordered_roots(fs_info, U64_MAX, 0, (u64)-1); }
static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook keescook@chromium.org
commit 9474f4e7cd71a633fa1ef93b7daefd44bbdfd482 upstream.
It's possible that a pid has died before we take the rcu lock, in which case we can't walk the ancestry list as it may be detached. Instead, check for death first before doing the walk.
Reported-by: syzbot+a9ac39bf55329e206219@syzkaller.appspotmail.com Fixes: 2d514487faf1 ("security: Yama LSM") Cc: stable@vger.kernel.org Suggested-by: Oleg Nesterov oleg@redhat.com Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: James Morris james.morris@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/yama/yama_lsm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/security/yama/yama_lsm.c +++ b/security/yama/yama_lsm.c @@ -368,7 +368,9 @@ static int yama_ptrace_access_check(stru break; case YAMA_SCOPE_RELATIONAL: rcu_read_lock(); - if (!task_is_descendant(current, child) && + if (!pid_alive(child)) + rc = -EPERM; + if (!rc && !task_is_descendant(current, child) && !ptracer_exception_found(current, child) && !ns_capable(__task_cred(child)->user_ns, CAP_SYS_PTRACE)) rc = -EPERM;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanley Chu stanley.chu@mediatek.com
commit 3f7e62bba0003f9c68f599f5997c4647ef5b4f4e upstream.
Commit 356fd2663cff ("scsi: Set request queue runtime PM status back to active on resume") fixed up the inconsistent RPM status between the request queue and the device. However, changing the request queue RPM status shall be done only on a successful resume; otherwise the status may still be inconsistent, as below:
Request queue: RPM_ACTIVE
Device: RPM_SUSPENDED
This ends up in a soft lockup because requests can be submitted to the underlying devices while those devices and their required resources are not resumed.
For example,
After the above inconsistent status happens, an IO request can be submitted to the UFS device driver, but the required resources (like clocks) are not resumed yet, leading to a warning with the call stack below:
WARN_ON(hba->clk_gating.state != CLKS_ON);
ufshcd_queuecommand
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
cfq_insert_request
__elv_add_request
blk_flush_plug_list
blk_finish_plug
jbd2_journal_commit_transaction
kjournald2
We may then see all pending IO requests hang because there is no response from the storage host or device, after which a soft lockup happens in the system. In the end, the system may crash in many ways.
Fixes: 356fd2663cff (scsi: Set request queue runtime PM status back to active on resume) Cc: stable@vger.kernel.org Signed-off-by: Stanley Chu stanley.chu@mediatek.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/scsi_pm.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
--- a/drivers/scsi/scsi_pm.c +++ b/drivers/scsi/scsi_pm.c @@ -79,8 +79,22 @@ static int scsi_dev_type_resume(struct d
if (err == 0) { pm_runtime_disable(dev); - pm_runtime_set_active(dev); + err = pm_runtime_set_active(dev); pm_runtime_enable(dev); + + /* + * Forcibly set runtime PM status of request queue to "active" + * to make sure we can again get requests from the queue + * (see also blk_pm_peek_request()). + * + * The resume hook will correct runtime PM status of the disk. + */ + if (!err && scsi_is_sdev_device(dev)) { + struct scsi_device *sdev = to_scsi_device(dev); + + if (sdev->request_queue->dev) + blk_set_runtime_active(sdev->request_queue); + } }
return err; @@ -139,16 +153,6 @@ static int scsi_bus_resume_common(struct else fn = NULL;
- /* - * Forcibly set runtime PM status of request queue to "active" to - * make sure we can again get requests from the queue (see also - * blk_pm_peek_request()). - * - * The resume hook will correct runtime PM status of the disk. - */ - if (scsi_is_sdev_device(dev) && pm_runtime_suspended(dev)) - blk_set_runtime_active(to_scsi_device(dev)->request_queue); - if (fn) { async_schedule_domain(fn, dev, &scsi_sd_pm_domain);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ivan Mironov mironov.ivan@gmail.com
commit 44759979a49bfd2d20d789add7fa81a21eb1a4ab upstream.
Changing the caching mode via /sys/devices/.../scsi_disk/.../cache_type may fail if the device responds to the MODE SENSE command with the DPOFUA flag set, and then checks that this flag is not set in the MODE SELECT command.
In this scenario, when trying to change cache_type, the write always fails:
# echo "none" >cache_type
bash: echo: write error: Invalid argument
And the following appears in dmesg:
[13007.865745] sd 1:0:1:0: [sda] Sense Key : Illegal Request [current]
[13007.865753] sd 1:0:1:0: [sda] Add. Sense: Invalid field in parameter list
From SBC-4 r15, 6.5.1 "Mode pages overview", description of DEVICE-SPECIFIC PARAMETER field in the mode parameter header:
... The write protect (WP) bit for mode data sent with a MODE SELECT command shall be ignored by the device server. ... The DPOFUA bit is reserved for mode data sent with a MODE SELECT command. ...
The remaining bits in the DEVICE-SPECIFIC PARAMETER byte are also reserved and shall be set to zero.
[mkp: shuffled commentary to commit description]
Cc: stable@vger.kernel.org Signed-off-by: Ivan Mironov mironov.ivan@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/sd.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -205,6 +205,12 @@ cache_type_store(struct device *dev, str sp = buffer_data[0] & 0x80 ? 1 : 0; buffer_data[0] &= ~0x80;
+ /* + * Ensure WP, DPOFUA, and RESERVED fields are cleared in + * received mode parameter buffer before doing MODE SELECT. + */ + data.device_specific = 0; + if (scsi_mode_select(sdp, 1, sp, 8, buffer_data, len, SD_TIMEOUT, SD_MAX_RETRIES, &data, &sshdr)) { if (scsi_sense_valid(&sshdr))
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit 5a9372f751b5350e0ce3d2ee91832f1feae2c2e5 upstream.
While reading through the sysvipc implementation, I noticed that the n32 semctl/shmctl/msgctl system calls behave differently based on whether o32 support is enabled or not: Without o32, the IPC_64 flag passed by user space is rejected but calls without that flag get IPC_64 behavior.
As far as I can tell, this was inadvertently changed by a cleanup patch but never noticed by anyone; possibly nobody has tried using sysvipc on n32 since linux-3.19.
Change it back to the old behavior now.
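For anyone who wants to exercise the affected path from an n32 userland, a small standalone test like the one below (my own sketch, not part of the patch; build it with an n32 toolchain, e.g. gcc -mabi=n32) issues a semctl(IPC_STAT), which a glibc-based n32 userland should again be able to do successfully once this change is applied:

  #include <stdio.h>
  #include <sys/types.h>
  #include <sys/ipc.h>
  #include <sys/sem.h>

  union semun {			/* caller-defined, as required on Linux */
  	int val;
  	struct semid_ds *buf;
  	unsigned short *array;
  };

  int main(void)
  {
  	struct semid_ds ds;
  	union semun arg;
  	int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

  	if (id < 0) {
  		perror("semget");
  		return 1;
  	}

  	arg.buf = &ds;
  	if (semctl(id, 0, IPC_STAT, arg) < 0)	/* goes through the compat sysvipc path */
  		perror("semctl(IPC_STAT)");
  	else
  		printf("IPC_STAT ok, nsems=%lu\n", (unsigned long)ds.sem_nsems);

  	semctl(id, 0, IPC_RMID);
  	return 0;
  }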
Fixes: 78aaf956ba3a ("MIPS: Compat: Fix build error if CONFIG_MIPS32_COMPAT but no compat ABI.") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Paul Burton paul.burton@mips.com Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -3149,6 +3149,7 @@ config MIPS32_O32 config MIPS32_N32 bool "Kernel support for n32 binaries" depends on 64BIT + select ARCH_WANT_COMPAT_IPC_PARSE_VERSION select COMPAT select MIPS32_COMPAT select SYSVIPC_COMPAT if SYSVIPC
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafał Miłecki rafal@milecki.pl
commit 321c46b91550adc03054125fa7a1639390608e1a upstream.
So far we never had any device registered for the SoC. This resulted in some small issues that we kept ignoring, like:
1) Not working GPIOLIB_IRQCHIP (gpiochip_irqchip_add_key() failing)
2) Lack of a proper tree in /sys/devices/
3) mips_dma_alloc_coherent() silently handling an empty coherent_dma_mask
Kernel 4.19 came with a lot of DMA changes and caused a regression on bcm47xx. Starting with the commit f8c55dc6e828 ("MIPS: use generic dma noncoherent ops for simple noncoherent platforms") DMA coherent allocations just fail. Example:
[ 1.114914] bgmac_bcma bcma0:2: Allocation of TX ring 0x200 failed
[ 1.121215] bgmac_bcma bcma0:2: Unable to alloc memory for DMA
[ 1.127626] bgmac_bcma: probe of bcma0:2 failed with error -12
[ 1.133838] bgmac_bcma: Broadcom 47xx GBit MAC driver loaded
The bgmac driver also triggers a WARNING:
[ 0.959486] ------------[ cut here ]------------
[ 0.964387] WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 bgmac_enet_probe+0x1b4/0x5c4
[ 0.973751] Modules linked in:
[ 0.976913] CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.9 #0
[ 0.982750] Stack : 804a0000 804597c4 00000000 00000000 80458fd8 8381bc2c 838282d4 80481a47
[ 0.991367] 8042e3ec 00000001 804d38f0 00000204 83980000 00000065 8381bbe0 6f55b24f
[ 0.999975] 00000000 00000000 80520000 00002018 00000000 00000075 00000007 00000000
[ 1.008583] 00000000 80480000 000ee811 00000000 00000000 00000000 80432c00 80248db8
[ 1.017196] 00000009 00000204 83980000 803ad7b0 00000000 801feeec 00000000 804d0000
[ 1.025804] ...
[ 1.028325] Call Trace:
[ 1.030875] [<8000aef8>] show_stack+0x58/0x100
[ 1.035513] [<8001f8b4>] __warn+0xe4/0x118
[ 1.039708] [<8001f9a4>] warn_slowpath_null+0x48/0x64
[ 1.044935] [<80248db8>] bgmac_enet_probe+0x1b4/0x5c4
[ 1.050101] [<802498e0>] bgmac_probe+0x558/0x590
[ 1.054906] [<80252fd0>] bcma_device_probe+0x38/0x70
[ 1.060017] [<8020e1e8>] really_probe+0x170/0x2e8
[ 1.064891] [<8020e714>] __driver_attach+0xa4/0xec
[ 1.069784] [<8020c1e0>] bus_for_each_dev+0x58/0xb0
[ 1.074833] [<8020d590>] bus_add_driver+0xf8/0x218
[ 1.079731] [<8020ef24>] driver_register+0xcc/0x11c
[ 1.084804] [<804b54cc>] bgmac_init+0x1c/0x44
[ 1.089258] [<8000121c>] do_one_initcall+0x7c/0x1a0
[ 1.094343] [<804a1d34>] kernel_init_freeable+0x150/0x218
[ 1.099886] [<803a082c>] kernel_init+0x10/0x104
[ 1.104583] [<80005878>] ret_from_kernel_thread+0x14/0x1c
[ 1.110107] ---[ end trace f441c0d873d1fb5b ]---
This patch sets up a "struct device" (and passes it to bcma), which allows fixing all the mentioned problems. It'll also require a tiny bcma patch, which will follow through the wireless tree & its maintainer.
Fixes: f8c55dc6e828 ("MIPS: use generic dma noncoherent ops for simple noncoherent platforms") Signed-off-by: Rafał Miłecki rafal@milecki.pl Signed-off-by: Paul Burton paul.burton@mips.com Acked-by: Hauke Mehrtens hauke@hauke-m.de Cc: Christoph Hellwig hch@lst.de Cc: Linus Walleij linus.walleij@linaro.org Cc: linux-wireless@vger.kernel.org Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/bcm47xx/setup.c | 31 +++++++++++++++++++++++++++++++ include/linux/bcma/bcma_soc.h | 1 + 2 files changed, 32 insertions(+)
--- a/arch/mips/bcm47xx/setup.c +++ b/arch/mips/bcm47xx/setup.c @@ -173,6 +173,31 @@ void __init plat_mem_setup(void) pm_power_off = bcm47xx_machine_halt; }
+#ifdef CONFIG_BCM47XX_BCMA +static struct device * __init bcm47xx_setup_device(void) +{ + struct device *dev; + int err; + + dev = kzalloc(sizeof(*dev), GFP_KERNEL); + if (!dev) + return NULL; + + err = dev_set_name(dev, "bcm47xx_soc"); + if (err) { + pr_err("Failed to set SoC device name: %d\n", err); + kfree(dev); + return NULL; + } + + err = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(32)); + if (err) + pr_err("Failed to set SoC DMA mask: %d\n", err); + + return dev; +} +#endif + /* * This finishes bus initialization doing things that were not possible without * kmalloc. Make sure to call it late enough (after mm_init). @@ -183,6 +208,10 @@ void __init bcm47xx_bus_setup(void) if (bcm47xx_bus_type == BCM47XX_BUS_TYPE_BCMA) { int err;
+ bcm47xx_bus.bcma.dev = bcm47xx_setup_device(); + if (!bcm47xx_bus.bcma.dev) + panic("Failed to setup SoC device\n"); + err = bcma_host_soc_init(&bcm47xx_bus.bcma); if (err) panic("Failed to initialize BCMA bus (err %d)", err); @@ -235,6 +264,8 @@ static int __init bcm47xx_register_bus_c #endif #ifdef CONFIG_BCM47XX_BCMA case BCM47XX_BUS_TYPE_BCMA: + if (device_register(bcm47xx_bus.bcma.dev)) + pr_err("Failed to register SoC device\n"); bcma_bus_register(&bcm47xx_bus.bcma.bus); break; #endif --- a/include/linux/bcma/bcma_soc.h +++ b/include/linux/bcma/bcma_soc.h @@ -6,6 +6,7 @@
struct bcma_soc { struct bcma_bus bus; + struct device *dev; };
int __init bcma_host_soc_register(struct bcma_soc *soc);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hauke Mehrtens hauke@hauke-m.de
commit 2b4dba55b04b212a7fd1f0395b41d79ee3a9801b upstream.
This makes SMP on the vrx200 work again, by removing all the MIPS CPU interrupt specific code and making it fully use the generic MIPS CPU interrupt controller.
The mti,cpu-interrupt-controller from irq-mips-cpu.c now handles the CPU interrupts and also the IPI interrupts, which are used for communication between the CPUs in an SMP system. The generic interrupt code was already used before, but the interrupt vectors were overwritten again when we called set_vi_handler() in the lantiq interrupt driver, and we also provided our own plat_irq_dispatch() function, which overwrote the weak generic implementation. Now the code uses the generic handler for the MIPS CPU interrupts, including the IPI interrupts, and registers a handler with irq_set_chained_handler() (which was already called before) for the CPU interrupts handled by the lantiq ICU.
Calling the set_c0_status() function is also not needed any more because the generic MIPS CPU interrupt code already activates the needed bits.
Fixes: 1eed40043579 ("MIPS: smp-mt: Use CPU interrupt controller IPI IRQ domain support") Cc: stable@kernel.org # v4.12 Signed-off-by: Hauke Mehrtens hauke@hauke-m.de Signed-off-by: Paul Burton paul.burton@mips.com Cc: jhogan@kernel.org Cc: ralf@linux-mips.org Cc: john@phrozen.org Cc: linux-mips@linux-mips.org Cc: linux-mips@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/lantiq/irq.c | 68 +++---------------------------------------------- 1 file changed, 5 insertions(+), 63 deletions(-)
--- a/arch/mips/lantiq/irq.c +++ b/arch/mips/lantiq/irq.c @@ -224,9 +224,11 @@ static struct irq_chip ltq_eiu_type = { .irq_set_type = ltq_eiu_settype, };
-static void ltq_hw_irqdispatch(int module) +static void ltq_hw_irq_handler(struct irq_desc *desc) { + int module = irq_desc_get_irq(desc) - 2; u32 irq; + int hwirq;
irq = ltq_icu_r32(module, LTQ_ICU_IM0_IOSR); if (irq == 0) @@ -237,7 +239,8 @@ static void ltq_hw_irqdispatch(int modul * other bits might be bogus */ irq = __fls(irq); - do_IRQ((int)irq + MIPS_CPU_IRQ_CASCADE + (INT_NUM_IM_OFFSET * module)); + hwirq = irq + MIPS_CPU_IRQ_CASCADE + (INT_NUM_IM_OFFSET * module); + generic_handle_irq(irq_linear_revmap(ltq_domain, hwirq));
/* if this is a EBU irq, we need to ack it or get a deadlock */ if ((irq == LTQ_ICU_EBU_IRQ) && (module == 0) && LTQ_EBU_PCC_ISTAT) @@ -245,49 +248,6 @@ static void ltq_hw_irqdispatch(int modul LTQ_EBU_PCC_ISTAT); }
-#define DEFINE_HWx_IRQDISPATCH(x) \ - static void ltq_hw ## x ## _irqdispatch(void) \ - { \ - ltq_hw_irqdispatch(x); \ - } -DEFINE_HWx_IRQDISPATCH(0) -DEFINE_HWx_IRQDISPATCH(1) -DEFINE_HWx_IRQDISPATCH(2) -DEFINE_HWx_IRQDISPATCH(3) -DEFINE_HWx_IRQDISPATCH(4) - -#if MIPS_CPU_TIMER_IRQ == 7 -static void ltq_hw5_irqdispatch(void) -{ - do_IRQ(MIPS_CPU_TIMER_IRQ); -} -#else -DEFINE_HWx_IRQDISPATCH(5) -#endif - -static void ltq_hw_irq_handler(struct irq_desc *desc) -{ - ltq_hw_irqdispatch(irq_desc_get_irq(desc) - 2); -} - -asmlinkage void plat_irq_dispatch(void) -{ - unsigned int pending = read_c0_status() & read_c0_cause() & ST0_IM; - int irq; - - if (!pending) { - spurious_interrupt(); - return; - } - - pending >>= CAUSEB_IP; - while (pending) { - irq = fls(pending) - 1; - do_IRQ(MIPS_CPU_IRQ_BASE + irq); - pending &= ~BIT(irq); - } -} - static int icu_map(struct irq_domain *d, unsigned int irq, irq_hw_number_t hw) { struct irq_chip *chip = <q_irq_type; @@ -343,28 +303,10 @@ int __init icu_of_init(struct device_nod for (i = 0; i < MAX_IM; i++) irq_set_chained_handler(i + 2, ltq_hw_irq_handler);
- if (cpu_has_vint) { - pr_info("Setting up vectored interrupts\n"); - set_vi_handler(2, ltq_hw0_irqdispatch); - set_vi_handler(3, ltq_hw1_irqdispatch); - set_vi_handler(4, ltq_hw2_irqdispatch); - set_vi_handler(5, ltq_hw3_irqdispatch); - set_vi_handler(6, ltq_hw4_irqdispatch); - set_vi_handler(7, ltq_hw5_irqdispatch); - } - ltq_domain = irq_domain_add_linear(node, (MAX_IM * INT_NUM_IM_OFFSET) + MIPS_CPU_IRQ_CASCADE, &irq_domain_ops, 0);
-#ifndef CONFIG_MIPS_MT_SMP - set_c0_status(IE_IRQ0 | IE_IRQ1 | IE_IRQ2 | - IE_IRQ3 | IE_IRQ4 | IE_IRQ5); -#else - set_c0_status(IE_SW0 | IE_SW1 | IE_IRQ0 | IE_IRQ1 | - IE_IRQ2 | IE_IRQ3 | IE_IRQ4 | IE_IRQ5); -#endif - /* tell oprofile which irq to use */ ltq_perfcount_irq = irq_create_mapping(ltq_domain, LTQ_PERF_IRQ);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhenyu Wang zhenyuw@linux.intel.com
commit 51b00d8509dc69c98740da2ad07308b630d3eb7d upstream.
This fixes a missing mmap range check on the vGPU bar2 region and only allows mapping the vGPU's allocated GMADDR range, which means user space should support sparse mmap to get the proper offset for mmapping the vGPU aperture. It also takes care of the actual pgoff in the mmap request, as the original code always mapped from the beginning of the vGPU aperture.
Fixes: 659643f7d814 ("drm/i915/gvt/kvmgt: add vfio/mdev support to KVMGT") Cc: "Monroy, Rodrigo Axel" rodrigo.axel.monroy@intel.com Cc: "Orrala Contreras, Alfredo" alfredo.orrala.contreras@intel.com Cc: stable@vger.kernel.org # v4.10+ Reviewed-by: Hang Yuan hang.yuan@intel.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/gvt/kvmgt.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -996,7 +996,7 @@ static int intel_vgpu_mmap(struct mdev_d { unsigned int index; u64 virtaddr; - unsigned long req_size, pgoff = 0; + unsigned long req_size, pgoff, req_start; pgprot_t pg_prot; struct intel_vgpu *vgpu = mdev_get_drvdata(mdev);
@@ -1014,7 +1014,17 @@ static int intel_vgpu_mmap(struct mdev_d pg_prot = vma->vm_page_prot; virtaddr = vma->vm_start; req_size = vma->vm_end - vma->vm_start; - pgoff = vgpu_aperture_pa_base(vgpu) >> PAGE_SHIFT; + pgoff = vma->vm_pgoff & + ((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1); + req_start = pgoff << PAGE_SHIFT; + + if (!intel_vgpu_in_aperture(vgpu, req_start)) + return -EINVAL; + if (req_start + req_size > + vgpu_aperture_offset(vgpu) + vgpu_aperture_sz(vgpu)) + return -EINVAL; + + pgoff = (gvt_aperture_pa_base(vgpu->gvt) >> PAGE_SHIFT) + pgoff;
return remap_pfn_range(vma, virtaddr, pgoff, req_size, pg_prot); }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Julia Lawall Julia.Lawall@lip6.fr
commit 28b170e88bc0c7509e6724717c15cb4b5686026e upstream.
Add an of_node_put when the result of of_graph_get_remote_port_parent is not available.
The semantic match that finds this problem is as follows (http://coccinelle.lip6.fr):
// <smpl>
@r exists@
local idexpression e;
expression x;
@@

e = of_graph_get_remote_port_parent(...);
... when != x = e
    when != true e == NULL
    when != of_node_put(e)
    when != of_fwnode_handle(e)
(
return e;
|
*return ...;
)
// </smpl>
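For reference, rules like this are run with Coccinelle's spatch tool; the file name below is only a placeholder for wherever the rule is saved:

  spatch --sp-file of_graph_put.cocci --dir drivers/of/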
Signed-off-by: Julia Lawall Julia.Lawall@lip6.fr Cc: stable@vger.kernel.org Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/of/property.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -806,6 +806,7 @@ struct device_node *of_graph_get_remote_
if (!of_device_is_available(remote)) { pr_debug("not available for remote node\n"); + of_node_put(remote); return NULL; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Hunter jonathanh@nvidia.com
commit ac4ca4b9f4623ba5e1ea7a582f286567c611e027 upstream.
The tps6586x driver creates an irqchip that is used by its various child devices for managing interrupts. The tps6586x-rtc device is one of its children that uses the tps6586x irqchip. When using the tps6586x-rtc as a wake-up device from suspend, the following is seen:
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.001 seconds) done.
OOM killer disabled.
Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
Disabling non-boot CPUs ...
Entering suspend state LP1
Enabling non-boot CPUs ...
CPU1 is up
tps6586x 3-0034: failed to read interrupt status
tps6586x 3-0034: failed to read interrupt status
The reason why the tps6586x interrupt status cannot be read is because the tps6586x interrupt is not masked during suspend and when the tps6586x-rtc interrupt occurs, to wake-up the device, the interrupt is seen before the i2c controller has been resumed in order to read the tps6586x interrupt status.
The tps6586x-rtc driver sets its interrupt as a wake-up source during suspend, which gets propagated to the parent tps6586x interrupt. However, the tps6586x-rtc driver cannot disable its interrupt during suspend, otherwise we would never be woken up, and so the tps6586x must disable its interrupt instead.
Prevent the tps6586x interrupt handler from executing on exiting suspend before the i2c controller has been resumed by disabling the tps6586x interrupt on entering suspend and re-enabling it on resuming from suspend.
Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter jonathanh@nvidia.com Reviewed-by: Dmitry Osipenko digetx@gmail.com Tested-by: Dmitry Osipenko digetx@gmail.com Acked-by: Thierry Reding treding@nvidia.com Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mfd/tps6586x.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
--- a/drivers/mfd/tps6586x.c +++ b/drivers/mfd/tps6586x.c @@ -592,6 +592,29 @@ static int tps6586x_i2c_remove(struct i2 return 0; }
+static int __maybe_unused tps6586x_i2c_suspend(struct device *dev) +{ + struct tps6586x *tps6586x = dev_get_drvdata(dev); + + if (tps6586x->client->irq) + disable_irq(tps6586x->client->irq); + + return 0; +} + +static int __maybe_unused tps6586x_i2c_resume(struct device *dev) +{ + struct tps6586x *tps6586x = dev_get_drvdata(dev); + + if (tps6586x->client->irq) + enable_irq(tps6586x->client->irq); + + return 0; +} + +static SIMPLE_DEV_PM_OPS(tps6586x_pm_ops, tps6586x_i2c_suspend, + tps6586x_i2c_resume); + static const struct i2c_device_id tps6586x_id_table[] = { { "tps6586x", 0 }, { }, @@ -602,6 +625,7 @@ static struct i2c_driver tps6586x_driver .driver = { .name = "tps6586x", .of_match_table = of_match_ptr(tps6586x_of_match), + .pm = &tps6586x_pm_ops, }, .probe = tps6586x_i2c_probe, .remove = tps6586x_i2c_remove,
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
commit 7fe9f01c04c2673bd6662c35b664f0f91888b96f upstream.
The num_planes field in struct v4l2_pix_format_mplane is used in a loop before validating it. As the use in this case is printing a debug message, just cap the value to the maximum allowed.
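Purely as an illustration of the clamp-before-iterate pattern (plain userspace C, not the V4L2 code; MAX_PLANES stands in for VIDEO_MAX_PLANES):

  #include <stdio.h>
  #include <stdint.h>

  #define MAX_PLANES 8			/* stand-in for VIDEO_MAX_PLANES */

  static void print_planes(const uint32_t *sizeimage, uint32_t num_planes)
  {
  	/* Clamp the caller-supplied count before indexing the fixed array. */
  	uint32_t planes = num_planes < MAX_PLANES ? num_planes : MAX_PLANES;
  	uint32_t i;

  	for (i = 0; i < planes; i++)
  		printf("plane %u: sizeimage=%u\n", i, sizeimage[i]);
  }

  int main(void)
  {
  	uint32_t sizeimage[MAX_PLANES] = { 4096, 2048 };

  	print_planes(sizeimage, 1000);	/* bogus count is capped, no overread */
  	return 0;
  }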
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Thierry Reding treding@nvidia.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Cc: stable@vger.kernel.org # for v4.12 and up Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/v4l2-core/v4l2-ioctl.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/media/v4l2-core/v4l2-ioctl.c +++ b/drivers/media/v4l2-core/v4l2-ioctl.c @@ -286,6 +286,7 @@ static void v4l_print_format(const void const struct v4l2_window *win; const struct v4l2_sdr_format *sdr; const struct v4l2_meta_format *meta; + u32 planes; unsigned i;
pr_cont("type=%s", prt_names(p->type, v4l2_type_names)); @@ -316,7 +317,8 @@ static void v4l_print_format(const void prt_names(mp->field, v4l2_field_names), mp->colorspace, mp->num_planes, mp->flags, mp->ycbcr_enc, mp->quantization, mp->xfer_func); - for (i = 0; i < mp->num_planes; i++) + planes = min_t(u32, mp->num_planes, VIDEO_MAX_PLANES); + for (i = 0; i < planes; i++) printk(KERN_DEBUG "plane %u: bytesperline=%u sizeimage=%u\n", i, mp->plane_fmt[i].bytesperline, mp->plane_fmt[i].sizeimage);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Romanovsky leonro@mellanox.com
commit a9666c1cae8dbcd1a9aacd08a778bf2a28eea300 upstream.
The unsafe global rkey is considered dangerous because it exposes remote access to all memory registered in the system. Only users with a QP on the same PD can use the rkey, and generally those QPs will already know the value. However, out of caution, do not expose the value to unprivileged users on the local system. Require CAP_NET_ADMIN instead.
Cc: stable@vger.kernel.org # 4.16 Fixes: 29cf1351d450 ("RDMA/nldev: provide detailed PD information") Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/core/nldev.c | 4 ---- 1 file changed, 4 deletions(-)
--- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -579,10 +579,6 @@ static int fill_res_pd_entry(struct sk_b if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_RES_USECNT, atomic_read(&pd->usecnt), RDMA_NLDEV_ATTR_PAD)) goto err; - if ((pd->flags & IB_PD_UNSAFE_GLOBAL_RKEY) && - nla_put_u32(msg, RDMA_NLDEV_ATTR_RES_UNSAFE_GLOBAL_RKEY, - pd->unsafe_global_rkey)) - goto err;
if (fill_res_name_pid(msg, res)) goto err;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adit Ranadive aditr@vmware.com
commit 6325e01b6cdf4636b721cf7259c1616e3cf28ce2 upstream.
Since the IB_WR_REG_MR opcode value changed, let's set the PVRDMA device opcodes explicitly.
Reported-by: Ruishuang Wang ruishuangw@vmware.com Fixes: 9a59739bd01f ("IB/rxe: Revise the ib_wr_opcode enum") Cc: stable@vger.kernel.org Reviewed-by: Bryan Tan bryantan@vmware.com Reviewed-by: Ruishuang Wang ruishuangw@vmware.com Reviewed-by: Vishnu Dasa vdasa@vmware.com Signed-off-by: Adit Ranadive aditr@vmware.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/vmw_pvrdma/pvrdma.h | 35 ++++++++++++++++++++++++++- drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c | 6 ++++ include/uapi/rdma/vmw_pvrdma-abi.h | 1 3 files changed, 41 insertions(+), 1 deletion(-)
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma.h +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma.h @@ -427,7 +427,40 @@ static inline enum ib_qp_state pvrdma_qp
static inline enum pvrdma_wr_opcode ib_wr_opcode_to_pvrdma(enum ib_wr_opcode op) { - return (enum pvrdma_wr_opcode)op; + switch (op) { + case IB_WR_RDMA_WRITE: + return PVRDMA_WR_RDMA_WRITE; + case IB_WR_RDMA_WRITE_WITH_IMM: + return PVRDMA_WR_RDMA_WRITE_WITH_IMM; + case IB_WR_SEND: + return PVRDMA_WR_SEND; + case IB_WR_SEND_WITH_IMM: + return PVRDMA_WR_SEND_WITH_IMM; + case IB_WR_RDMA_READ: + return PVRDMA_WR_RDMA_READ; + case IB_WR_ATOMIC_CMP_AND_SWP: + return PVRDMA_WR_ATOMIC_CMP_AND_SWP; + case IB_WR_ATOMIC_FETCH_AND_ADD: + return PVRDMA_WR_ATOMIC_FETCH_AND_ADD; + case IB_WR_LSO: + return PVRDMA_WR_LSO; + case IB_WR_SEND_WITH_INV: + return PVRDMA_WR_SEND_WITH_INV; + case IB_WR_RDMA_READ_WITH_INV: + return PVRDMA_WR_RDMA_READ_WITH_INV; + case IB_WR_LOCAL_INV: + return PVRDMA_WR_LOCAL_INV; + case IB_WR_REG_MR: + return PVRDMA_WR_FAST_REG_MR; + case IB_WR_MASKED_ATOMIC_CMP_AND_SWP: + return PVRDMA_WR_MASKED_ATOMIC_CMP_AND_SWP; + case IB_WR_MASKED_ATOMIC_FETCH_AND_ADD: + return PVRDMA_WR_MASKED_ATOMIC_FETCH_AND_ADD; + case IB_WR_REG_SIG_MR: + return PVRDMA_WR_REG_SIG_MR; + default: + return PVRDMA_WR_ERROR; + } }
static inline enum ib_wc_status pvrdma_wc_status_to_ib( --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c @@ -721,6 +721,12 @@ int pvrdma_post_send(struct ib_qp *ibqp, wr->opcode == IB_WR_RDMA_WRITE_WITH_IMM) wqe_hdr->ex.imm_data = wr->ex.imm_data;
+ if (unlikely(wqe_hdr->opcode == PVRDMA_WR_ERROR)) { + *bad_wr = wr; + ret = -EINVAL; + goto out; + } + switch (qp->ibqp.qp_type) { case IB_QPT_GSI: case IB_QPT_UD: --- a/include/uapi/rdma/vmw_pvrdma-abi.h +++ b/include/uapi/rdma/vmw_pvrdma-abi.h @@ -78,6 +78,7 @@ enum pvrdma_wr_opcode { PVRDMA_WR_MASKED_ATOMIC_FETCH_AND_ADD, PVRDMA_WR_BIND_MW, PVRDMA_WR_REG_SIG_MR, + PVRDMA_WR_ERROR, };
enum pvrdma_wc_status {
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Burton paul.burton@mips.com
commit 16fd20aa98080c2fa666dc384036ec08c80af710 upstream.
When building using GCC 4.7 or older, -ffunction-sections & the -pg flag used by ftrace are incompatible. This causes warnings or build failures (where -Werror applies) such as the following:
arch/mips/generic/init.c: error: -ffunction-sections disabled; it makes profiling impossible
This used to be taken into account by the ordering of calls to cc-option from within the top-level Makefile, which was introduced by commit 90ad4052e85c ("kbuild: avoid conflict between -ffunction-sections and -pg on gcc-4.7"). Unfortunately this was broken when the CONFIG_LD_DEAD_CODE_DATA_ELIMINATION cc-option check was moved to Kconfig in commit e85d1d65cd8a ("kbuild: test dead code/data elimination support in Kconfig"), because the flags used by this check no longer include -pg.
Fix this by not allowing CONFIG_LD_DEAD_CODE_DATA_ELIMINATION to be enabled at the same time as ftrace/CONFIG_FUNCTION_TRACER when building using GCC 4.7 or older.
Signed-off-by: Paul Burton paul.burton@mips.com Fixes: e85d1d65cd8a ("kbuild: test dead code/data elimination support in Kconfig") Reported-by: Geert Uytterhoeven geert@linux-m68k.org Cc: Nicholas Piggin npiggin@gmail.com Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- init/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/init/Kconfig +++ b/init/Kconfig @@ -1102,6 +1102,7 @@ config LD_DEAD_CODE_DATA_ELIMINATION bool "Dead code and data elimination (EXPERIMENTAL)" depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION depends on EXPERT + depends on !(FUNCTION_TRACER && CC_IS_GCC && GCC_VERSION < 40800) depends on $(cc-option,-ffunction-sections -fdata-sections) depends on $(ld-option,--gc-sections) help
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 3f1bb6abdf19cfa89860b3bc9e7f31b44b6a0ba1 upstream.
Use the new of_get_compatible_child() helper to look up child nodes to avoid ever matching non-child nodes elsewhere in the tree.
Also fix up the related struct device_node leaks.
Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver") Cc: stable stable@vger.kernel.org # 4.19: 36156f9241cb0 Cc: Linus Walleij linus.walleij@linaro.org Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/dsa/realtek-smi.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
--- a/drivers/net/dsa/realtek-smi.c +++ b/drivers/net/dsa/realtek-smi.c @@ -347,16 +347,17 @@ int realtek_smi_setup_mdio(struct realte struct device_node *mdio_np; int ret;
- mdio_np = of_find_compatible_node(smi->dev->of_node, NULL, - "realtek,smi-mdio"); + mdio_np = of_get_compatible_child(smi->dev->of_node, "realtek,smi-mdio"); if (!mdio_np) { dev_err(smi->dev, "no MDIO bus node\n"); return -ENODEV; }
smi->slave_mii_bus = devm_mdiobus_alloc(smi->dev); - if (!smi->slave_mii_bus) - return -ENOMEM; + if (!smi->slave_mii_bus) { + ret = -ENOMEM; + goto err_put_node; + } smi->slave_mii_bus->priv = smi; smi->slave_mii_bus->name = "SMI slave MII"; smi->slave_mii_bus->read = realtek_smi_mdio_read; @@ -371,10 +372,15 @@ int realtek_smi_setup_mdio(struct realte if (ret) { dev_err(smi->dev, "unable to register MDIO bus %s\n", smi->slave_mii_bus->id); - of_node_put(mdio_np); + goto err_put_node; }
return 0; + +err_put_node: + of_node_put(mdio_np); + + return ret; }
static int realtek_smi_probe(struct platform_device *pdev) @@ -457,6 +463,8 @@ static int realtek_smi_remove(struct pla struct realtek_smi *smi = dev_get_drvdata(&pdev->dev);
dsa_unregister_switch(smi->ds); + if (smi->slave_mii_bus) + of_node_put(smi->slave_mii_bus->dev.of_node); gpiod_set_value(smi->reset, 1);
return 0;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook keescook@chromium.org
commit 5631e8576a3caf606cdc375f97425a67983b420c upstream.
Yue Hu noticed that when parsing device tree the allocated platform data was never freed. Since it's not used beyond the function scope, this switches to using a stack variable instead.
Reported-by: Yue Hu huyue2@yulong.com Fixes: 35da60941e44 ("pstore/ram: add Device Tree bindings") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/pstore/ram.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/fs/pstore/ram.c +++ b/fs/pstore/ram.c @@ -713,18 +713,15 @@ static int ramoops_probe(struct platform { struct device *dev = &pdev->dev; struct ramoops_platform_data *pdata = dev->platform_data; + struct ramoops_platform_data pdata_local; struct ramoops_context *cxt = &oops_cxt; size_t dump_mem_sz; phys_addr_t paddr; int err = -EINVAL;
if (dev_of_node(dev) && !pdata) { - pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL); - if (!pdata) { - pr_err("cannot allocate platform data buffer\n"); - err = -ENOMEM; - goto fail_out; - } + pdata = &pdata_local; + memset(pdata, 0, sizeof(*pdata));
err = ramoops_parse_dt(pdev, pdata); if (err < 0)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit 1598ecda7b239e9232dda032bfddeed9d89fab6c upstream.
kaslr_early_init() is called with the kernel mapped at its link time offset, and if it returns with a non-zero offset, the kernel is unmapped and remapped again at the randomized offset.
During its execution, kaslr_early_init() also randomizes the base of the module region and of the linear mapping of DRAM, and sets two variables accordingly. However, since these variables are assigned with the caches on, they may get lost during the cache maintenance that occurs when unmapping and remapping the kernel, so ensure that these values are cleaned to the PoC.
Acked-by: Catalin Marinas catalin.marinas@arm.com Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kernel/kaslr.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -14,6 +14,7 @@ #include <linux/sched.h> #include <linux/types.h>
+#include <asm/cacheflush.h> #include <asm/fixmap.h> #include <asm/kernel-pgtable.h> #include <asm/memory.h> @@ -43,7 +44,7 @@ static __init u64 get_kaslr_seed(void *f return ret; }
-static __init const u8 *get_cmdline(void *fdt) +static __init const u8 *kaslr_get_cmdline(void *fdt) { static __initconst const u8 default_cmdline[] = CONFIG_CMDLINE;
@@ -109,7 +110,7 @@ u64 __init kaslr_early_init(u64 dt_phys) * Check if 'nokaslr' appears on the command line, and * return 0 if that is the case. */ - cmdline = get_cmdline(fdt); + cmdline = kaslr_get_cmdline(fdt); str = strstr(cmdline, "nokaslr"); if (str == cmdline || (str > cmdline && *(str - 1) == ' ')) return 0; @@ -169,5 +170,8 @@ u64 __init kaslr_early_init(u64 dt_phys) module_alloc_base += (module_range * (seed & ((1 << 21) - 1))) >> 21; module_alloc_base &= PAGE_MASK;
+ __flush_dcache_area(&module_alloc_base, sizeof(module_alloc_base)); + __flush_dcache_area(&memstart_offset_seed, sizeof(memstart_offset_seed)); + return offset; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heinrich Schuchardt xypron.glpk@gmx.de
commit 132ac39cffbcfed80ada38ef0fc6d34d95da7be6 upstream.
The memory area [0x4000000-0x4200000[ is occupied by the PSCI firmware. Any attempt to access it from Linux leads to an immediate crash.
So let's make the same memory reservation as the vendor kernel.
[gregory: added as comment that this region matches the mainline U-boot] Signed-off-by: Heinrich Schuchardt xypron.glpk@gmx.de Signed-off-by: Gregory CLEMENT gregory.clement@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/boot/dts/marvell/armada-ap806.dtsi | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
--- a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi +++ b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi @@ -27,6 +27,23 @@ method = "smc"; };
+ reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + ranges; + + /* + * This area matches the mapping done with a + * mainline U-Boot, and should be updated by the + * bootloader. + */ + + psci-area@4000000 { + reg = <0x0 0x4000000 0x0 0x200000>; + no-map; + }; + }; + ap806 { #address-cells = <2>; #size-cells = <2>;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: YunQiang Su ysu@wavecomp.com
commit a214720cbf50cd8c3f76bbb9c3f5c283910e9d33 upstream.
Octeon has a boot-time option to disable PCIe.
Since MSI depends on PCIe, we should also disable MSI when this option is on, in order to avoid inadvertently accessing PCIe registers.
Signed-off-by: YunQiang Su ysu@wavecomp.com Signed-off-by: Paul Burton paul.burton@mips.com Cc: pburton@wavecomp.com Cc: linux-mips@vger.kernel.org Cc: aaro.koskinen@iki.fi Cc: stable@vger.kernel.org # v3.3+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/pci/msi-octeon.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/mips/pci/msi-octeon.c +++ b/arch/mips/pci/msi-octeon.c @@ -369,7 +369,9 @@ int __init octeon_msi_initialize(void) int irq; struct irq_chip *msi;
- if (octeon_dma_bar_type == OCTEON_DMA_BAR_TYPE_PCIE) { + if (octeon_dma_bar_type == OCTEON_DMA_BAR_TYPE_INVALID) { + return 0; + } else if (octeon_dma_bar_type == OCTEON_DMA_BAR_TYPE_PCIE) { msi_rcv_reg[0] = CVMX_PEXP_NPEI_MSI_RCV0; msi_rcv_reg[1] = CVMX_PEXP_NPEI_MSI_RCV1; msi_rcv_reg[2] = CVMX_PEXP_NPEI_MSI_RCV2;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian La Roche florian.laroche@googlemail.com
commit fbfaf851902cd9293f392f3a1735e0543016d530 upstream.
If an input number x for int_sqrt64() has the highest bit set, then fls64(x) is 64. (1UL << 64) is an overflow and breaks the algorithm.
Subtracting 1 is a better guess for the initial value of m anyway and that's what also done in int_sqrt() implicitly [*].
[*] Note how int_sqrt() uses __fls() with two underscores, which already returns the proper raw bit number.
In contrast, int_sqrt64() used fls64(), and that returns bit numbers illogically starting at 1, because of error handling for the "no bits set" case. Will points out that the bug is probably due to a copy-and-paste error from the regular int_sqrt() case.
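To make the off-by-one concrete, here is a standalone userspace sketch of the corrected algorithm (my own reconstruction, not the kernel file; fls64() is emulated with __builtin_clzll()):

  #include <stdio.h>
  #include <stdint.h>

  /* fls64(x): index of the most significant set bit, counting from 1 (0 if x == 0). */
  static int fls64(uint64_t x)
  {
  	return x ? 64 - __builtin_clzll(x) : 0;
  }

  /* Integer square root for 64-bit inputs, with the corrected initial guess. */
  static uint32_t int_sqrt64(uint64_t x)
  {
  	uint64_t b, m, y = 0;

  	if (x == 0)
  		return 0;

  	/* fls64(x) - 1 is the raw bit number; with the top bit set this is 63,
  	 * so the shift stays representable, whereas 1ULL << 64 is undefined. */
  	m = 1ULL << ((fls64(x) - 1) & ~1ULL);
  	while (m != 0) {
  		b = y + m;
  		y >>= 1;
  		if (x >= b) {
  			x -= b;
  			y += m;
  		}
  		m >>= 2;
  	}
  	return y;
  }

  int main(void)
  {
  	uint64_t x = 1ULL << 63;	/* highest bit set: the case that used to break */

  	printf("int_sqrt64(2^63) = %u\n", int_sqrt64(x));	/* expect 3037000499 */
  	return 0;
  }

With the old initial guess this input would have needed 1ULL << 64; with fls64(x) - 1 the shift is 62 and the result comes out as 3037000499, the floor of sqrt(2^63).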
Signed-off-by: Florian La Roche Florian.LaRoche@googlemail.com Acked-by: Will Deacon will.deacon@arm.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- lib/int_sqrt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -52,7 +52,7 @@ u32 int_sqrt64(u64 x) if (x <= ULONG_MAX) return int_sqrt((unsigned long) x);
- m = 1ULL << (fls64(x) & ~1ULL); + m = 1ULL << ((fls64(x) - 1) & ~1ULL); while (m != 0) { b = y + m; y >>= 1;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlad Tsyrklevich vlad@tsyrklevich.net
commit a01421e4484327fe44f8e126793ed5a48a221e24 upstream.
Using [1] for static analysis I found that the OMAPFB_QUERY_PLANE, OMAPFB_GET_COLOR_KEY, OMAPFB_GET_DISPLAY_INFO, and OMAPFB_GET_VRAM_INFO cases could all leak uninitialized stack memory--either due to uninitialized padding or 'reserved' fields.
Fix them by clearing the shared union used to store copied out data.
[1] https://github.com/vlad902/kernel-uninitialized-memory-checker
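As a generic illustration of why the memset matters (plain userspace C with a made-up struct, not the omapfb layout): member assignments alone leave padding and untouched 'reserved' fields holding whatever was on the stack, so the whole object is cleared before it is filled and copied out:

  #include <stdio.h>
  #include <string.h>

  struct reply {			/* hypothetical ioctl reply layout */
  	char flag;		/* padding bytes typically follow on most ABIs */
  	int  value;
  	int  reserved;		/* never written by the handler */
  };

  static void fill_reply(struct reply *r)
  {
  	/* Clearing the whole object first is what prevents leaking stale
  	 * stack bytes through padding and unwritten 'reserved' fields. */
  	memset(r, 0, sizeof(*r));
  	r->flag = 1;
  	r->value = 42;
  }

  int main(void)
  {
  	struct reply r;
  	unsigned char raw[sizeof(r)];
  	size_t i;

  	fill_reply(&r);
  	memcpy(raw, &r, sizeof(r));	/* stands in for copy_to_user() */

  	for (i = 0; i < sizeof(raw); i++)
  		printf("%02x ", raw[i]);
  	printf("\n");
  	return 0;
  }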
Signed-off-by: Vlad Tsyrklevich vlad@tsyrklevich.net Reviewed-by: Kees Cook keescook@chromium.org Fixes: b39a982ddecf ("OMAP: DSS2: omapfb driver") Cc: security@kernel.org [b.zolnierkie: prefix patch subject with "omap2fb: "] Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c +++ b/drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c @@ -609,6 +609,8 @@ int omapfb_ioctl(struct fb_info *fbi, un
int r = 0;
+ memset(&p, 0, sizeof(p)); + switch (cmd) { case OMAPFB_SYNC_GFX: DBG("ioctl SYNC_GFX\n");
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
commit 701f49bc028edb19ffccd101997dd84f0d71e279 upstream.
kthread_run returns an error pointer, but elsewhere in the code dev->kthread_vid_cap/out is checked against NULL.
If kthread_run returns an error, then set the pointer to NULL.
I chose this method over changing all kthread_vid_cap/out tests elsewhere since this is more robust.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Reported-by: syzbot+53d5b2df0d9744411e2e@syzkaller.appspotmail.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/vivid/vivid-kthread-cap.c | 5 ++++- drivers/media/platform/vivid/vivid-kthread-out.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/media/platform/vivid/vivid-kthread-cap.c +++ b/drivers/media/platform/vivid/vivid-kthread-cap.c @@ -865,8 +865,11 @@ int vivid_start_generating_vid_cap(struc "%s-vid-cap", dev->v4l2_dev.name);
if (IS_ERR(dev->kthread_vid_cap)) { + int err = PTR_ERR(dev->kthread_vid_cap); + + dev->kthread_vid_cap = NULL; v4l2_err(&dev->v4l2_dev, "kernel_thread() failed\n"); - return PTR_ERR(dev->kthread_vid_cap); + return err; } *pstreaming = true; vivid_grab_controls(dev, true); --- a/drivers/media/platform/vivid/vivid-kthread-out.c +++ b/drivers/media/platform/vivid/vivid-kthread-out.c @@ -236,8 +236,11 @@ int vivid_start_generating_vid_out(struc "%s-vid-out", dev->v4l2_dev.name);
if (IS_ERR(dev->kthread_vid_out)) { + int err = PTR_ERR(dev->kthread_vid_out); + + dev->kthread_vid_out = NULL; v4l2_err(&dev->v4l2_dev, "kernel_thread() failed\n"); - return PTR_ERR(dev->kthread_vid_out); + return err; } *pstreaming = true; vivid_grab_controls(dev, true);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
commit 9729d6d282a6d7ce88e64c9119cecdf79edf4e88 upstream.
The capture DV timings capabilities allowed for a minimum width and height of 0, so a timings struct with 0 values was accepted and would later cause a division by zero.
Ensure that the width and height must be >= 16 to avoid this.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Reported-by: syzbot+57c3d83d71187054d56f@syzkaller.appspotmail.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/vivid/vivid-vid-common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/vivid/vivid-vid-common.c +++ b/drivers/media/platform/vivid/vivid-vid-common.c @@ -21,7 +21,7 @@ const struct v4l2_dv_timings_cap vivid_d .type = V4L2_DV_BT_656_1120, /* keep this initialization for compatibility with GCC < 4.4.6 */ .reserved = { 0 }, - V4L2_INIT_BT_TIMINGS(0, MAX_WIDTH, 0, MAX_HEIGHT, 14000000, 775000000, + V4L2_INIT_BT_TIMINGS(16, MAX_WIDTH, 16, MAX_HEIGHT, 14000000, 775000000, V4L2_DV_BT_STD_CEA861 | V4L2_DV_BT_STD_DMT | V4L2_DV_BT_STD_CVT | V4L2_DV_BT_STD_GTF, V4L2_DV_BT_CAP_PROGRESSIVE | V4L2_DV_BT_CAP_INTERLACED)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
commit e7c87bd6cc4ec7b0ac1ed0a88a58f8206c577488 upstream.
Syzkaller was able to construct a packet of negative length by redirecting from bpf_prog_test_run_skb with BPF_PROG_TYPE_LWT_XMIT:
BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:345 [inline]
BUG: KASAN: slab-out-of-bounds in skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline]
BUG: KASAN: slab-out-of-bounds in __pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395
Read of size 4294967282 at addr ffff8801d798009c by task syz-executor2/12942

kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
memcpy+0x23/0x50 mm/kasan/kasan.c:302
memcpy include/linux/string.h:345 [inline]
skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline]
__pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395
__pskb_copy include/linux/skbuff.h:1053 [inline]
pskb_copy include/linux/skbuff.h:2904 [inline]
skb_realloc_headroom+0xe7/0x120 net/core/skbuff.c:1539
ipip6_tunnel_xmit net/ipv6/sit.c:965 [inline]
sit_tunnel_xmit+0xe1b/0x30d0 net/ipv6/sit.c:1029
__netdev_start_xmit include/linux/netdevice.h:4325 [inline]
netdev_start_xmit include/linux/netdevice.h:4334 [inline]
xmit_one net/core/dev.c:3219 [inline]
dev_hard_start_xmit+0x295/0xc90 net/core/dev.c:3235
__dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805
dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
__bpf_tx_skb net/core/filter.c:2016 [inline]
__bpf_redirect_common net/core/filter.c:2054 [inline]
__bpf_redirect+0x5cf/0xb20 net/core/filter.c:2061
____bpf_clone_redirect net/core/filter.c:2094 [inline]
bpf_clone_redirect+0x2f6/0x490 net/core/filter.c:2066
bpf_prog_41f2bcae09cd4ac3+0xb25/0x1000
The generated test constructs a packet with mac header, network header, skb->data pointing to network header and skb->len 0.
Redirecting to a sit0 through __bpf_redirect_no_mac pulls the mac length, even though skb->data already is at skb->network_header. bpf_prog_test_run_skb has already pulled it as LWT_XMIT !is_l2.
Update the offset calculation to pull only if skb->data differs from skb->network_header, which is not true in this case.
The test itself can be run only from commit 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command"), but the same type of packets with skb at network header could already be built from lwt xmit hooks, so this fix is more relevant to that commit.
Also set the mac header on redirect from LWT_XMIT, as even after this change to __bpf_redirect_no_mac that field is expected to be set, but is not yet in ip_finish_output2.
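A toy model of the offset arithmetic (ordinary C, not kernel code) shows why pulling mac_len from this packet underflows: with skb->data already at the network header and len 0, the old 14-byte pull wraps to 4294967282 (the same bogus size seen in the KASAN report above), while the new skb_network_offset()-based pull is 0:

  #include <stdio.h>

  /* Toy model: the fields mirror sk_buff header offsets into one flat buffer. */
  struct toy_skb {
  	unsigned int data;		/* where skb->data points */
  	unsigned int mac_header;
  	unsigned int network_header;
  	unsigned int len;		/* bytes from data to tail */
  };

  int main(void)
  {
  	/* The generated packet: skb->data already at the network header, len 0. */
  	struct toy_skb skb = { .data = 14, .mac_header = 0,
  			       .network_header = 14, .len = 0 };

  	unsigned int old_mlen = skb.network_header - skb.mac_header;	/* 14 */
  	unsigned int new_mlen = skb.network_header - skb.data;		/* 0, skb_network_offset() */

  	printf("old pull of %u bytes -> len %u\n", old_mlen, skb.len - old_mlen);
  	printf("new pull of %u bytes -> len %u\n", new_mlen, skb.len - new_mlen);
  	return 0;
  }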
Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Willem de Bruijn willemb@google.com Acked-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/core/filter.c | 21 +++++++++++---------- net/core/lwt_bpf.c | 1 + 2 files changed, 12 insertions(+), 10 deletions(-)
--- a/net/core/filter.c +++ b/net/core/filter.c @@ -2018,18 +2018,19 @@ static inline int __bpf_tx_skb(struct ne static int __bpf_redirect_no_mac(struct sk_buff *skb, struct net_device *dev, u32 flags) { - /* skb->mac_len is not set on normal egress */ - unsigned int mlen = skb->network_header - skb->mac_header; + unsigned int mlen = skb_network_offset(skb);
- __skb_pull(skb, mlen); + if (mlen) { + __skb_pull(skb, mlen);
- /* At ingress, the mac header has already been pulled once. - * At egress, skb_pospull_rcsum has to be done in case that - * the skb is originated from ingress (i.e. a forwarded skb) - * to ensure that rcsum starts at net header. - */ - if (!skb_at_tc_ingress(skb)) - skb_postpull_rcsum(skb, skb_mac_header(skb), mlen); + /* At ingress, the mac header has already been pulled once. + * At egress, skb_pospull_rcsum has to be done in case that + * the skb is originated from ingress (i.e. a forwarded skb) + * to ensure that rcsum starts at net header. + */ + if (!skb_at_tc_ingress(skb)) + skb_postpull_rcsum(skb, skb_mac_header(skb), mlen); + } skb_pop_mac_header(skb); skb_reset_mac_len(skb); return flags & BPF_F_INGRESS ? --- a/net/core/lwt_bpf.c +++ b/net/core/lwt_bpf.c @@ -63,6 +63,7 @@ static int run_lwt_bpf(struct sk_buff *s lwt->name ? : "<unknown>"); ret = BPF_OK; } else { + skb_reset_mac_header(skb); ret = skb_do_redirect(skb); if (ret == 0) ret = BPF_REDIRECT;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit 8d933670452107e41165bea70a30dffbd281bef1 upstream.
syzbot was able to crash one host with the following stack trace :
kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 8625 Comm: syz-executor4 Not tainted 4.20.0+ #8 RIP: 0010:dev_net include/linux/netdevice.h:2169 [inline] RIP: 0010:icmp6_send+0x116/0x2d30 net/ipv6/icmp.c:426 icmpv6_send smack_socket_sock_rcv_skb security_sock_rcv_skb sk_filter_trim_cap __sk_receive_skb dccp_v6_do_rcv release_sock
This is because an RX packet found the socket owned by the user and was stored into the socket backlog. Before leaving the RCU-protected section, skb->dev was cleared in __sk_receive_skb(). When the socket backlog was finally handled at release_sock() time, the skb was fed to smack_socket_sock_rcv_skb() and then to icmp6_send().
We could fix the bug in smack_socket_sock_rcv_skb(), or simply make icmp6_send() more robust against such possibility.
In the future we might pass the net pointer to icmp6_send() instead of inferring it.
Fixes: d66a8acbda92 ("Smack: Inform peer that IPv6 traffic has been blocked") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Piotr Sawicki p.sawicki2@partner.samsung.com Cc: Casey Schaufler casey@schaufler-ca.com Reported-by: syzbot syzkaller@googlegroups.com Acked-by: Casey Schaufler casey@schaufler-ca.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/ipv6/icmp.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -421,10 +421,10 @@ static int icmp6_iif(const struct sk_buf static void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info, const struct in6_addr *force_saddr) { - struct net *net = dev_net(skb->dev); struct inet6_dev *idev = NULL; struct ipv6hdr *hdr = ipv6_hdr(skb); struct sock *sk; + struct net *net; struct ipv6_pinfo *np; const struct in6_addr *saddr = NULL; struct dst_entry *dst; @@ -435,12 +435,16 @@ static void icmp6_send(struct sk_buff *s int iif = 0; int addr_type = 0; int len; - u32 mark = IP6_REPLY_MARK(net, skb->mark); + u32 mark;
if ((u8 *)hdr < skb->head || (skb_network_header(skb) + sizeof(*hdr)) > skb_tail_pointer(skb)) return;
+ if (!skb->dev) + return; + net = dev_net(skb->dev); + mark = IP6_REPLY_MARK(net, skb->mark); /* * Make sure we respect the rules * i.e. RFC 1885 2.4(e)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morris james.morris@microsoft.com
commit a5795fd38ee8194451ba3f281f075301a3696ce2 upstream.
From: Casey Schaufler casey@schaufler-ca.com
Check that the cred security blob has been set before trying to clean it up. There is a case during credential initialization that could result in this.
Signed-off-by: Casey Schaufler casey@schaufler-ca.com Acked-by: John Johansen john.johansen@canonical.com Signed-off-by: James Morris james.morris@microsoft.com Reported-by: syzbot+69ca07954461f189e808@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/security.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/security/security.c +++ b/security/security.c @@ -1003,6 +1003,13 @@ int security_cred_alloc_blank(struct cre
void security_cred_free(struct cred *cred) { + /* + * There is a failure case in prepare_creds() that + * may result in a call here with ->security being NULL. + */ + if (unlikely(cred->security == NULL)) + return; + call_void_hook(cred_free, cred); }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil@xs4all.nl
commit cd26d1c4d1bc947b56ae404998ae2276df7b39b7 upstream.
If a filehandle is dup()ped, then it is possible to close it from one fd and call mmap from the other. This creates a race condition in vb2_mmap where it is using queue data that __vb2_queue_free (called from close()) is in the process of releasing.
By moving the mutex_lock(mmap_lock) up in vb2_mmap, this race is avoided, since __vb2_queue_free is called with the same mutex locked. vb2_mmap therefore reads consistent buffer data.
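The pattern is easier to see outside the driver; a minimal user-space sketch (pthreads, made-up names) of take-the-lock-before-the-check, so a concurrent release cannot invalidate the state between the check and its use:

/* build: cc -pthread toy_mmap.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the queue state that close()/__vb2_queue_free() tears down. */
struct toy_queue {
    pthread_mutex_t mmap_lock;
    bool buffers_valid;
};

/* mmap path: take the lock first, then check, so a concurrent free cannot
 * invalidate the state between the check and its use. */
static int toy_mmap(struct toy_queue *q)
{
    int ret = 0;

    pthread_mutex_lock(&q->mmap_lock);
    if (!q->buffers_valid) {
        ret = -1;               /* -EBUSY/-EINVAL analogue */
        goto unlock;
    }
    /* ... buffer data can be used safely here ... */
unlock:
    pthread_mutex_unlock(&q->mmap_lock);
    return ret;
}

/* release path: takes the same lock before tearing the state down. */
static void toy_queue_free(struct toy_queue *q)
{
    pthread_mutex_lock(&q->mmap_lock);
    q->buffers_valid = false;
    pthread_mutex_unlock(&q->mmap_lock);
}

int main(void)
{
    struct toy_queue q = { PTHREAD_MUTEX_INITIALIZER, true };

    printf("mmap before free: %d\n", toy_mmap(&q));
    toy_queue_free(&q);
    printf("mmap after free:  %d\n", toy_mmap(&q));
    return 0;
}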
Signed-off-by: Hans Verkuil hverkuil@xs4all.nl Reported-by: syzbot+be93025dd45dccd8923c@syzkaller.appspotmail.com Signed-off-by: Hans Verkuil hansverk@cisco.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/common/videobuf2/videobuf2-core.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/drivers/media/common/videobuf2/videobuf2-core.c +++ b/drivers/media/common/videobuf2/videobuf2-core.c @@ -1933,9 +1933,13 @@ int vb2_mmap(struct vb2_queue *q, struct return -EINVAL; } } + + mutex_lock(&q->mmap_lock); + if (vb2_fileio_is_active(q)) { dprintk(1, "mmap: file io in progress\n"); - return -EBUSY; + ret = -EBUSY; + goto unlock; }
/* @@ -1943,7 +1947,7 @@ int vb2_mmap(struct vb2_queue *q, struct */ ret = __find_plane_by_offset(q, off, &buffer, &plane); if (ret) - return ret; + goto unlock;
vb = q->bufs[buffer];
@@ -1959,8 +1963,9 @@ int vb2_mmap(struct vb2_queue *q, struct return -EINVAL; }
- mutex_lock(&q->mmap_lock); ret = call_memop(vb, mmap, vb->planes[plane].mem_priv, vma); + +unlock: mutex_unlock(&q->mmap_lock); if (ret) return ret;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: J. Bruce Fields bfields@redhat.com
commit 81c88b18de1f11f70c97f28ced8d642c00bb3955 upstream.
If we ignore the error we'll hit a null dereference a little later.
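The shape of the added error handling can be sketched in user space with made-up names (check the allocation, unwind what was already built, return -ENOMEM):

#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct toy_map {
    char *r_addr;
};

/* Check the string allocation and unwind the partially built map on
 * failure, instead of handing the caller a map with a NULL field. */
static int toy_getport(struct toy_map **out, const char *uaddr)
{
    struct toy_map *map = calloc(1, sizeof(*map));

    if (!map)
        return -ENOMEM;
    map->r_addr = strdup(uaddr);    /* rpc_sockaddr2uaddr() analogue */
    if (!map->r_addr) {
        free(map);                  /* "bailout_free_args" analogue */
        return -ENOMEM;
    }
    *out = map;
    return 0;
}

int main(void)
{
    struct toy_map *map;

    if (toy_getport(&map, "192.0.2.1.0.111") == 0) {
        free(map->r_addr);
        free(map);
    }
    return 0;
}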
Reported-by: syzbot+4b98281f2401ab849f4b@syzkaller.appspotmail.com Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/rpcb_clnt.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/net/sunrpc/rpcb_clnt.c +++ b/net/sunrpc/rpcb_clnt.c @@ -771,6 +771,12 @@ void rpcb_getport_async(struct rpc_task case RPCBVERS_3: map->r_netid = xprt->address_strings[RPC_DISPLAY_NETID]; map->r_addr = rpc_sockaddr2uaddr(sap, GFP_ATOMIC); + if (!map->r_addr) { + status = -ENOMEM; + dprintk("RPC: %5u %s: no memory available\n", + task->tk_pid, __func__); + goto bailout_free_args; + } map->r_owner = ""; break; case RPCBVERS_2: @@ -793,6 +799,8 @@ void rpcb_getport_async(struct rpc_task rpc_put_task(child); return;
+bailout_free_args: + kfree(map); bailout_release_client: rpc_release_client(rpcb_clnt); bailout_nofree:
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shakeel Butt shakeelb@google.com
commit e2c8d550a973bb34fc28bc8d0ec996f84562fb8a upstream.
The [ip,ip6,arp]_tables use x_tables_info internally and the underlying memory is already accounted to kmemcg. Do the same for ebtables. The syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the whole system from a restricted memcg, a potential DoS.
By accounting the ebt_table_info, the memory used for ebt_table_info can be contained within the memcg of the allocating process. However the lifetime of ebt_table_info is independent of the allocating process and is tied to the network namespace. So, the oom-killer will not be able to relieve the memory pressure due to ebt_table_info memory. The memory for ebt_table_info is allocated through vmalloc. Currently vmalloc does not handle the oom-killed allocating process correctly and one large allocation can bypass memcg limit enforcement. So, with this patch, at least the small allocations will be contained. For large allocations, we need to fix vmalloc.
Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com Signed-off-by: Shakeel Butt shakeelb@google.com Reviewed-by: Kirill Tkhai ktkhai@virtuozzo.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/bridge/netfilter/ebtables.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -1137,14 +1137,16 @@ static int do_replace(struct net *net, c tmp.name[sizeof(tmp.name) - 1] = 0;
countersize = COUNTER_OFFSET(tmp.nentries) * nr_cpu_ids; - newinfo = vmalloc(sizeof(*newinfo) + countersize); + newinfo = __vmalloc(sizeof(*newinfo) + countersize, GFP_KERNEL_ACCOUNT, + PAGE_KERNEL); if (!newinfo) return -ENOMEM;
if (countersize) memset(newinfo->counters, 0, countersize);
- newinfo->entries = vmalloc(tmp.entries_size); + newinfo->entries = __vmalloc(tmp.entries_size, GFP_KERNEL_ACCOUNT, + PAGE_KERNEL); if (!newinfo->entries) { ret = -ENOMEM; goto free_newinfo;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yufen Yu yuyufen@huawei.com
commit 94a2c3a32b62e868dc1e3d854326745a7f1b8c7a upstream.
We recently got a stack by syzkaller like this:
BUG: sleeping function called from invalid context at mm/slab.h:361 in_atomic(): 1, irqs_disabled(): 0, pid: 6644, name: blkid INFO: lockdep is turned off. CPU: 1 PID: 6644 Comm: blkid Not tainted 4.4.163-514.55.6.9.x86_64+ #76 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 0000000000000000 5ba6a6b879e50c00 ffff8801f6b07b10 ffffffff81cb2194 0000000041b58ab3 ffffffff833c7745 ffffffff81cb2080 5ba6a6b879e50c00 0000000000000000 0000000000000001 0000000000000004 0000000000000000 Call Trace: <IRQ> [<ffffffff81cb2194>] __dump_stack lib/dump_stack.c:15 [inline] <IRQ> [<ffffffff81cb2194>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51 [<ffffffff8129a981>] ___might_sleep+0x291/0x490 kernel/sched/core.c:7675 [<ffffffff8129ac33>] __might_sleep+0xb3/0x270 kernel/sched/core.c:7637 [<ffffffff81794c13>] slab_pre_alloc_hook mm/slab.h:361 [inline] [<ffffffff81794c13>] slab_alloc_node mm/slub.c:2610 [inline] [<ffffffff81794c13>] slab_alloc mm/slub.c:2692 [inline] [<ffffffff81794c13>] kmem_cache_alloc_trace+0x2c3/0x5c0 mm/slub.c:2709 [<ffffffff81cbe9a7>] kmalloc include/linux/slab.h:479 [inline] [<ffffffff81cbe9a7>] kzalloc include/linux/slab.h:623 [inline] [<ffffffff81cbe9a7>] kobject_uevent_env+0x2c7/0x1150 lib/kobject_uevent.c:227 [<ffffffff81cbf84f>] kobject_uevent+0x1f/0x30 lib/kobject_uevent.c:374 [<ffffffff81cbb5b9>] kobject_cleanup lib/kobject.c:633 [inline] [<ffffffff81cbb5b9>] kobject_release+0x229/0x440 lib/kobject.c:675 [<ffffffff81cbb0a2>] kref_sub include/linux/kref.h:73 [inline] [<ffffffff81cbb0a2>] kref_put include/linux/kref.h:98 [inline] [<ffffffff81cbb0a2>] kobject_put+0x72/0xd0 lib/kobject.c:692 [<ffffffff8216f095>] put_device+0x25/0x30 drivers/base/core.c:1237 [<ffffffff81c4cc34>] delete_partition_rcu_cb+0x1d4/0x2f0 block/partition-generic.c:232 [<ffffffff813c08bc>] __rcu_reclaim kernel/rcu/rcu.h:118 [inline] [<ffffffff813c08bc>] rcu_do_batch kernel/rcu/tree.c:2705 [inline] [<ffffffff813c08bc>] invoke_rcu_callbacks kernel/rcu/tree.c:2973 [inline] [<ffffffff813c08bc>] __rcu_process_callbacks kernel/rcu/tree.c:2940 [inline] [<ffffffff813c08bc>] rcu_process_callbacks+0x59c/0x1c70 kernel/rcu/tree.c:2957 [<ffffffff8120f509>] __do_softirq+0x299/0xe20 kernel/softirq.c:273 [<ffffffff81210496>] invoke_softirq kernel/softirq.c:350 [inline] [<ffffffff81210496>] irq_exit+0x216/0x2c0 kernel/softirq.c:391 [<ffffffff82c2cd7b>] exiting_irq arch/x86/include/asm/apic.h:652 [inline] [<ffffffff82c2cd7b>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926 [<ffffffff82c2bc25>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:746 <EOI> [<ffffffff814cbf40>] ? audit_kill_trees+0x180/0x180 [<ffffffff8187d2f7>] fd_install+0x57/0x80 fs/file.c:626 [<ffffffff8180989e>] do_sys_open+0x45e/0x550 fs/open.c:1043 [<ffffffff818099c2>] SYSC_open fs/open.c:1055 [inline] [<ffffffff818099c2>] SyS_open+0x32/0x40 fs/open.c:1050 [<ffffffff82c299e1>] entry_SYSCALL_64_fastpath+0x1e/0x9a
In softirq context, we call the rcu callback function delete_partition_rcu_cb(), which may allocate memory with kzalloc using the GFP_KERNEL flag. If the allocation cannot be satisfied immediately, it may sleep. However, that is not allowed in softirq context.
Although we found this problem on Linux 4.4, the latest kernel version seems to have it as well. It is very similar to the previous one: https://lkml.org/lkml/2018/7/9/391
Fix it by using RCU workqueue, which allows sleep.
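As a rough user-space analogy (pthreads, illustrative names, not the kernel API): the context that must not perform blocking cleanup only hands the object to a worker, and the worker, which is allowed to block, does the actual teardown. This mirrors replacing call_rcu() with INIT_RCU_WORK()/queue_rcu_work().

/* build: cc -pthread defer.c */
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Single-slot hand-off: one side only stores a pointer, the worker side
 * is free to sleep and free it. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static void *pending;

/* Analogue of the RCU/softirq callback: queue the object, nothing more. */
static void defer_cleanup(void *obj)
{
    pthread_mutex_lock(&lock);
    pending = obj;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

/* Analogue of the workqueue worker: blocking is allowed here. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!pending)
        pthread_cond_wait(&cond, &lock);
    void *obj = pending;
    pending = NULL;
    pthread_mutex_unlock(&lock);

    usleep(1000);   /* stands in for cleanup work that may sleep */
    free(obj);
    return NULL;
}

int main(void)
{
    pthread_t tid;
    void *obj = malloc(64);

    if (!obj)
        return 1;
    pthread_create(&tid, NULL, worker, NULL);
    defer_cleanup(obj);
    pthread_join(tid, NULL);
    return 0;
}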
Reviewed-by: Paul E. McKenney paulmck@linux.ibm.com Signed-off-by: Yufen Yu yuyufen@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- block/partition-generic.c | 8 +++++--- include/linux/genhd.h | 2 +- 2 files changed, 6 insertions(+), 4 deletions(-)
--- a/block/partition-generic.c +++ b/block/partition-generic.c @@ -249,9 +249,10 @@ struct device_type part_type = { .uevent = part_uevent, };
-static void delete_partition_rcu_cb(struct rcu_head *head) +static void delete_partition_work_fn(struct work_struct *work) { - struct hd_struct *part = container_of(head, struct hd_struct, rcu_head); + struct hd_struct *part = container_of(to_rcu_work(work), struct hd_struct, + rcu_work);
part->start_sect = 0; part->nr_sects = 0; @@ -262,7 +263,8 @@ static void delete_partition_rcu_cb(stru void __delete_partition(struct percpu_ref *ref) { struct hd_struct *part = container_of(ref, struct hd_struct, ref); - call_rcu(&part->rcu_head, delete_partition_rcu_cb); + INIT_RCU_WORK(&part->rcu_work, delete_partition_work_fn); + queue_rcu_work(system_wq, &part->rcu_work); }
/* --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -129,7 +129,7 @@ struct hd_struct { struct disk_stats dkstats; #endif struct percpu_ref ref; - struct rcu_head rcu_head; + struct rcu_work rcu_work; };
#define GENHD_FL_REMOVABLE 1
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephen Smalley sds@tycho.nsa.gov
commit 5b0e7310a2a33c06edc7eb81ffc521af9b2c5610 upstream.
levdatum->level can be NULL if we encounter an error while loading the policy during sens_read prior to initializing it. Make sure sens_destroy handles that case correctly.
Reported-by: syzbot+6664500f0f18f07a5c0e@syzkaller.appspotmail.com Signed-off-by: Stephen Smalley sds@tycho.nsa.gov Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/selinux/ss/policydb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/security/selinux/ss/policydb.c +++ b/security/selinux/ss/policydb.c @@ -732,7 +732,8 @@ static int sens_destroy(void *key, void kfree(key); if (datum) { levdatum = datum; - ebitmap_destroy(&levdatum->level->cat); + if (levdatum->level) + ebitmap_destroy(&levdatum->level->cat); kfree(levdatum->level); } kfree(datum);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 04906b2f542c23626b0ef6219b808406f8dddbe9 upstream.
bd_set_size() also updates the block device's block size. This is somewhat unexpected given its name and, at this point, only blkdev_open() uses this functionality. Furthermore, it can result in the block size changing under a filesystem mounted on a loop device, which leads to livelocks inside __getblk_gfp() like:
Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 10863 Comm: syz-executor0 Not tainted 4.18.0-rc5+ #151 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__sanitizer_cov_trace_pc+0x3f/0x50 kernel/kcov.c:106 ... Call Trace: init_page_buffers+0x3e2/0x530 fs/buffer.c:904 grow_dev_page fs/buffer.c:947 [inline] grow_buffers fs/buffer.c:1009 [inline] __getblk_slow fs/buffer.c:1036 [inline] __getblk_gfp+0x906/0xb10 fs/buffer.c:1313 __bread_gfp+0x2d/0x310 fs/buffer.c:1347 sb_bread include/linux/buffer_head.h:307 [inline] fat12_ent_bread+0x14e/0x3d0 fs/fat/fatent.c:75 fat_ent_read_block fs/fat/fatent.c:441 [inline] fat_alloc_clusters+0x8ce/0x16e0 fs/fat/fatent.c:489 fat_add_cluster+0x7a/0x150 fs/fat/inode.c:101 __fat_get_block fs/fat/inode.c:148 [inline] ...
A trivial reproducer for the problem looks like:
truncate -s 1G /tmp/image losetup /dev/loop0 /tmp/image mkfs.ext4 -b 1024 /dev/loop0 mount -t ext4 /dev/loop0 /mnt losetup -c /dev/loop0 l /mnt
Fix the problem by moving the initialization of a block device's block size into a separate function and calling it when needed.
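The selection loop that moves into set_init_blocksize() can be exercised in isolation; a user-space sketch (assumed 4 KiB page size, illustrative names): keep doubling the logical block size while the device size stays aligned, capping at the page size.

#include <assert.h>

/* Mirrors the loop moved into set_init_blocksize(): pick the largest block
 * size <= the page size that still divides the device size evenly, starting
 * from the logical block size. */
static unsigned int init_blocksize(unsigned int logical_bs, long long size,
                                   unsigned int page_size)
{
    unsigned int bsize = logical_bs;

    while (bsize < page_size) {
        if (size & bsize)
            break;
        bsize <<= 1;
    }
    return bsize;
}

int main(void)
{
    /* 1 GiB image on a 512-byte logical block device, 4 KiB pages */
    assert(init_blocksize(512, 1LL << 30, 4096) == 4096);
    /* an odd 3 KiB image cannot go past 1 KiB blocks */
    assert(init_blocksize(512, 3072, 4096) == 1024);
    return 0;
}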
Thanks to Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp for help with debugging the problem.
Reported-by: syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/block_dev.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
--- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -104,6 +104,20 @@ void invalidate_bdev(struct block_device } EXPORT_SYMBOL(invalidate_bdev);
+static void set_init_blocksize(struct block_device *bdev) +{ + unsigned bsize = bdev_logical_block_size(bdev); + loff_t size = i_size_read(bdev->bd_inode); + + while (bsize < PAGE_SIZE) { + if (size & bsize) + break; + bsize <<= 1; + } + bdev->bd_block_size = bsize; + bdev->bd_inode->i_blkbits = blksize_bits(bsize); +} + int set_blocksize(struct block_device *bdev, int size) { /* Size must be a power of two, and between 512 and PAGE_SIZE */ @@ -1408,18 +1422,9 @@ EXPORT_SYMBOL(check_disk_change);
void bd_set_size(struct block_device *bdev, loff_t size) { - unsigned bsize = bdev_logical_block_size(bdev); - inode_lock(bdev->bd_inode); i_size_write(bdev->bd_inode, size); inode_unlock(bdev->bd_inode); - while (bsize < PAGE_SIZE) { - if (size & bsize) - break; - bsize <<= 1; - } - bdev->bd_block_size = bsize; - bdev->bd_inode->i_blkbits = blksize_bits(bsize); } EXPORT_SYMBOL(bd_set_size);
@@ -1496,8 +1501,10 @@ static int __blkdev_get(struct block_dev } }
- if (!ret) + if (!ret) { bd_set_size(bdev,(loff_t)get_capacity(disk)<<9); + set_init_blocksize(bdev); + }
/* * If the device is invalidated, rescan partition @@ -1532,6 +1539,7 @@ static int __blkdev_get(struct block_dev goto out_clear; } bd_set_size(bdev, (loff_t)bdev->bd_part->nr_sects << 9); + set_init_blocksize(bdev); }
if (bdev->bd_bdi == &noop_backing_dev_info)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
commit 400b8b9a2a17918f8ce00786f596f530e7f30d50 upstream.
A similar issue to the one fixed in commit 4a2eb0c37b47 ("sctp: initialize sin6_flowinfo for ipv6 addrs in sctp_inet6addr_event") also exists in sctp_inetaddr_event, as Alexander noticed.
To fix it, allocate sctp_sockaddr_entry with kzalloc for both SCTP IPv4 and IPv6 addresses, as is already done in sctp_v4/6_copy_addrlist().
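The difference kzalloc() buys here can be shown with plain calloc() versus malloc(); a user-space sketch with a toy entry type (not the real struct): every field the NETDEV_UP handler does not set explicitly, such as sin6_port and sin6_flowinfo, starts out as zero instead of allocator leftovers.

#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for struct sctp_sockaddr_entry. */
struct toy_sockaddr_entry {
    struct sockaddr_in6 v6;
    int valid;
};

int main(void)
{
    /* calloc() plays the role of kzalloc(): every byte starts at zero. */
    struct toy_sockaddr_entry *addr = calloc(1, sizeof(*addr));

    if (!addr)
        return 1;
    /* Only the fields the NETDEV_UP handler cares about are set ... */
    addr->v6.sin6_family = AF_INET6;
    addr->valid = 1;
    /* ... and the rest is guaranteed zero, not heap garbage. */
    printf("sin6_port=%u sin6_flowinfo=%u\n",
           (unsigned int)addr->v6.sin6_port,
           (unsigned int)addr->v6.sin6_flowinfo);
    free(addr);
    return 0;
}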
Reported-by: Alexander Potapenko glider@google.com Signed-off-by: Xin Long lucien.xin@gmail.com Reported-by: syzbot+ae0c70c0c2d40c51bb92@syzkaller.appspotmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Acked-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sctp/ipv6.c | 5 +---- net/sctp/protocol.c | 4 +--- 2 files changed, 2 insertions(+), 7 deletions(-)
--- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -97,11 +97,9 @@ static int sctp_inet6addr_event(struct n
switch (ev) { case NETDEV_UP: - addr = kmalloc(sizeof(struct sctp_sockaddr_entry), GFP_ATOMIC); + addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v6.sin6_family = AF_INET6; - addr->a.v6.sin6_port = 0; - addr->a.v6.sin6_flowinfo = 0; addr->a.v6.sin6_addr = ifa->addr; addr->a.v6.sin6_scope_id = ifa->idev->dev->ifindex; addr->valid = 1; @@ -431,7 +429,6 @@ static void sctp_v6_copy_addrlist(struct addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v6.sin6_family = AF_INET6; - addr->a.v6.sin6_port = 0; addr->a.v6.sin6_addr = ifp->addr; addr->a.v6.sin6_scope_id = dev->ifindex; addr->valid = 1; --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -101,7 +101,6 @@ static void sctp_v4_copy_addrlist(struct addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v4.sin_family = AF_INET; - addr->a.v4.sin_port = 0; addr->a.v4.sin_addr.s_addr = ifa->ifa_local; addr->valid = 1; INIT_LIST_HEAD(&addr->list); @@ -776,10 +775,9 @@ static int sctp_inetaddr_event(struct no
switch (ev) { case NETDEV_UP: - addr = kmalloc(sizeof(struct sctp_sockaddr_entry), GFP_ATOMIC); + addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v4.sin_family = AF_INET; - addr->a.v4.sin_port = 0; addr->a.v4.sin_addr.s_addr = ifa->ifa_local; addr->valid = 1; spin_lock_bh(&net->sctp.local_addr_lock);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit a88289f4ddee4165d5f796bd99e09eec3133c16b upstream.
syzbot reported:
BUG: KMSAN: uninit-value in tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 CPU: 0 PID: 66 Comm: kworker/u4:4 Not tainted 4.17.0-rc3+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_rcv tipc_conn_recv_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683 tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 tipc_conn_rcv_from_sock net/tipc/topsrv.c:409 [inline] tipc_conn_recv_work+0x3cd/0x560 net/tipc/topsrv.c:424 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2279 kthread+0x539/0x720 kernel/kthread.c:239 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412
Local variable description: ----s.i@tipc_conn_recv_work Variable was created at: tipc_conn_recv_work+0x65/0x560 net/tipc/topsrv.c:419 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145
tipc_conn_rcv_from_sock() always assumes that the length of the message received from sock_recvmsg() is not smaller than the size of struct tipc_subscr. However, this assumption is false: when the received message is shorter than struct tipc_subscr, we end up touching uninitialized fields in tipc_conn_rcv_sub().
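A user-space sketch of the rule the patch enforces (toy types, not the real tipc structures): only act on the message when the read returned exactly one full structure.

#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

struct toy_subscr {         /* stands in for struct tipc_subscr */
    uint32_t seq_lower;
    uint32_t seq_upper;
    uint32_t timeout;
    uint32_t filter;
};

/* Only report success when a complete structure was received; a short
 * read must not be interpreted as a subscription request. */
static int rcv_sub(const void *buf, ssize_t ret, struct toy_subscr *out)
{
    if (ret != (ssize_t)sizeof(*out))
        return 0;
    memcpy(out, buf, sizeof(*out));
    return 1;
}

int main(void)
{
    int fds[2];
    unsigned char buf[64];
    struct toy_subscr s, full = { 1, 2, 3, 4 };

    assert(pipe(fds) == 0);

    assert(write(fds[1], &full, sizeof(full)) == (ssize_t)sizeof(full));
    assert(rcv_sub(buf, read(fds[0], buf, sizeof(buf)), &s) == 1);  /* full message */

    assert(write(fds[1], &full, 5) == 5);
    assert(rcv_sub(buf, read(fds[0], buf, sizeof(buf)), &s) == 0);  /* truncated */
    return 0;
}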
Reported-by: syzbot+8951a3065ee7fd6d6e23@syzkaller.appspotmail.com Reported-by: syzbot+75e6e042c5bbf691fc82@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/topsrv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/tipc/topsrv.c +++ b/net/tipc/topsrv.c @@ -404,7 +404,7 @@ static int tipc_conn_rcv_from_sock(struc ret = sock_recvmsg(con->sock, &msg, MSG_DONTWAIT); if (ret == -EWOULDBLOCK) return -EWOULDBLOCK; - if (ret > 0) { + if (ret == sizeof(s)) { read_lock_bh(&sk->sk_callback_lock); ret = tipc_conn_rcv_sub(srv, con, &s); read_unlock_bh(&sk->sk_callback_lock);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 8b66fee7f8ee18f9c51260e7a43ab37db5177a05 upstream.
syzbot reports following splat:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486 CPU: 1 PID: 11057 Comm: syz-executor0 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295 strlen+0x3b/0xa0 lib/string.c:486 nla_put_string include/net/netlink.h:1154 [inline] tipc_nl_compat_link_reset_stats+0x1f0/0x360 net/tipc/netlink_compat.c:760 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline] tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x457ec9 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2557338c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f25573396d4 R13: 00000000004cb478 R14: 00000000004d86c8 R15: 00000000ffffffff
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2759 [inline] __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383 __kmalloc_reserve net/core/skbuff.c:137 [inline] __alloc_skb+0x309/0xa20 net/core/skbuff.c:205 alloc_skb include/linux/skbuff.h:998 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline] netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
The uninitialised access happened in tipc_nl_compat_link_reset_stats: nla_put_string(skb, TIPC_NLA_LINK_NAME, name)
This is because the name string is not validated before it is used.
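The helper added below boils down to "is there a NUL within the supplied bytes"; a stand-alone user-space sketch of the same check:

#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same idea as the helper added by the patch: the buffer only counts as a
 * string if a terminating NUL is found within the bytes actually supplied. */
static bool string_is_valid(const char *s, size_t len)
{
    return memchr(s, '\0', len) != NULL;
}

int main(void)
{
    char ok[8] = "link0";                       /* NUL-terminated */
    char bad[8] = { 'l', 'i', 'n', 'k', '0', 'X', 'Y', 'Z' };

    assert(string_is_valid(ok, sizeof(ok)));
    assert(!string_is_valid(bad, sizeof(bad))); /* would have tripped strlen() */
    return 0;
}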
Reported-by: syzbot+e01d94b5a4c266be6e4c@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -87,6 +87,11 @@ static int tipc_skb_tailroom(struct sk_b return limit; }
+static inline int TLV_GET_DATA_LEN(struct tlv_desc *tlv) +{ + return TLV_GET_LEN(tlv) - TLV_SPACE(0); +} + static int tipc_add_tlv(struct sk_buff *skb, u16 type, void *data, u16 len) { struct tlv_desc *tlv = (struct tlv_desc *)skb_tail_pointer(skb); @@ -166,6 +171,11 @@ static struct sk_buff *tipc_get_err_tlv( return buf; }
+static inline bool string_is_valid(char *s, int len) +{ + return memchr(s, '\0', len) ? true : false; +} + static int __tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd, struct tipc_nl_compat_msg *msg, struct sk_buff *arg) @@ -750,6 +760,7 @@ static int tipc_nl_compat_link_reset_sta { char *name; struct nlattr *link; + int len;
name = (char *)TLV_DATA(msg->req);
@@ -757,6 +768,10 @@ static int tipc_nl_compat_link_reset_sta if (!link) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME); + if (!string_is_valid(name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_LINK_NAME, name)) return -EMSGSIZE;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 0762216c0ad2a2fccd63890648eca491f2c83d9a upstream.
syzbot reported:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:484 CPU: 1 PID: 6371 Comm: syz-executor652 Not tainted 4.19.0-rc8+ #70 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x306/0x460 lib/dump_stack.c:113 kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917 __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500 strlen+0x3b/0xa0 lib/string.c:484 nla_put_string include/net/netlink.h:1011 [inline] tipc_nl_compat_bearer_enable+0x238/0x7b0 net/tipc/netlink_compat.c:389 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x39f/0xae0 net/tipc/netlink_compat.c:344 tipc_nl_compat_recv+0x147c/0x2760 net/tipc/netlink_compat.c:1107 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x440179 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffef7beee8 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179 RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00 R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline] kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180 kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2727 [inline] __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x422/0xe90 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:996 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline] netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
The root cause is that we don't validate whether the bearer name is a valid string in tipc_nl_compat_bearer_enable().
Meanwhile, we also fix the same issue in the following functions: tipc_nl_compat_bearer_disable() tipc_nl_compat_link_stat_dump() tipc_nl_compat_media_set() tipc_nl_compat_bearer_set()
Reported-by: syzbot+b33d5cae0efd35dbfe77@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -389,6 +389,7 @@ static int tipc_nl_compat_bearer_enable( struct nlattr *prop; struct nlattr *bearer; struct tipc_bearer_config *b; + int len;
b = (struct tipc_bearer_config *)TLV_DATA(msg->req);
@@ -396,6 +397,10 @@ static int tipc_nl_compat_bearer_enable( if (!bearer) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME); + if (!string_is_valid(b->name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_BEARER_NAME, b->name)) return -EMSGSIZE;
@@ -421,6 +426,7 @@ static int tipc_nl_compat_bearer_disable { char *name; struct nlattr *bearer; + int len;
name = (char *)TLV_DATA(msg->req);
@@ -428,6 +434,10 @@ static int tipc_nl_compat_bearer_disable if (!bearer) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME); + if (!string_is_valid(name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_BEARER_NAME, name)) return -EMSGSIZE;
@@ -488,6 +498,7 @@ static int tipc_nl_compat_link_stat_dump struct nlattr *prop[TIPC_NLA_PROP_MAX + 1]; struct nlattr *stats[TIPC_NLA_STATS_MAX + 1]; int err; + int len;
if (!attrs[TIPC_NLA_LINK]) return -EINVAL; @@ -514,6 +525,11 @@ static int tipc_nl_compat_link_stat_dump return err;
name = (char *)TLV_DATA(msg->req); + + len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME); + if (!string_is_valid(name, len)) + return -EINVAL; + if (strcmp(name, nla_data(link[TIPC_NLA_LINK_NAME])) != 0) return 0;
@@ -654,6 +670,7 @@ static int tipc_nl_compat_media_set(stru struct nlattr *prop; struct nlattr *media; struct tipc_link_config *lc; + int len;
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
@@ -661,6 +678,10 @@ static int tipc_nl_compat_media_set(stru if (!media) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_MEDIA_NAME); + if (!string_is_valid(lc->name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_MEDIA_NAME, lc->name)) return -EMSGSIZE;
@@ -681,6 +702,7 @@ static int tipc_nl_compat_bearer_set(str struct nlattr *prop; struct nlattr *bearer; struct tipc_link_config *lc; + int len;
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
@@ -688,6 +710,10 @@ static int tipc_nl_compat_bearer_set(str if (!bearer) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_MEDIA_NAME); + if (!string_is_valid(lc->name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_BEARER_NAME, lc->name)) return -EMSGSIZE;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit edf5ff04a45750ac8ce2435974f001dc9cfbf055 upstream.
syzbot reports following splat:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486 CPU: 1 PID: 9306 Comm: syz-executor172 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313 strlen+0x3b/0xa0 lib/string.c:486 nla_put_string include/net/netlink.h:1154 [inline] __tipc_nl_compat_link_set net/tipc/netlink_compat.c:708 [inline] tipc_nl_compat_link_set+0x929/0x1220 net/tipc/netlink_compat.c:744 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline] tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
The uninitialised access happened in nla_put_string(skb, TIPC_NLA_LINK_NAME, lc->name)
This is because the lc->name string is not validated before it is used.
Reported-by: syzbot+d78b8a29241a195aefb8@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -762,9 +762,14 @@ static int tipc_nl_compat_link_set(struc struct tipc_link_config *lc; struct tipc_bearer *bearer; struct tipc_media *media; + int len;
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME); + if (!string_is_valid(lc->name, len)) + return -EINVAL; + media = tipc_media_find(lc->name); if (media) { cmd->doit = &__tipc_nl_media_set;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 974cb0e3e7c963ced06c4e32c5b2884173fa5e01 upstream.
syzbot reported:
BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline] BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826 CPU: 0 PID: 6290 Comm: syz-executor848 Not tainted 4.19.0-rc8+ #70 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x306/0x460 lib/dump_stack.c:113 kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917 __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500 __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] __fswab32 include/uapi/linux/swab.h:59 [inline] tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826 __tipc_nl_compat_dumpit+0x59e/0xdb0 net/tipc/netlink_compat.c:205 tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:270 tipc_nl_compat_handle net/tipc/netlink_compat.c:1151 [inline] tipc_nl_compat_recv+0x1402/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x440179 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffecec49318 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179 RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00 R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline] kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180 kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2727 [inline] __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x422/0xe90 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:996 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline] netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
In tipc_nl_compat_name_table_dump(), we cannot take for granted that the length of the data contained in the TLV is at least the size of struct tipc_name_table_query.
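A stand-alone sketch of the added check (toy struct, not the real layout): refuse to interpret the payload as the structure unless the payload is at least that large.

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct toy_query {          /* stands in for struct tipc_name_table_query */
    uint32_t depth;
    uint32_t type;
    uint32_t lowbound;
    uint32_t upbound;
};

/* Reject the request unless the TLV payload is at least as large as the
 * structure whose fields are about to be read. */
static int name_table_dump(const void *data, size_t data_len)
{
    struct toy_query ntq;

    if (data_len < sizeof(ntq))
        return -EINVAL;
    memcpy(&ntq, data, sizeof(ntq));
    return (int)ntq.depth;  /* safe to use the fields now */
}

int main(void)
{
    struct toy_query q = { 2, 0, 0, ~0u };

    assert(name_table_dump(&q, sizeof(q)) == 2);
    assert(name_table_dump(&q, 3) == -EINVAL);  /* short TLV is rejected */
    return 0;
}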
Reported-by: syzbot+06e771a754829716a327@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -824,6 +824,8 @@ static int tipc_nl_compat_name_table_dum };
ntq = (struct tipc_name_table_query *)TLV_DATA(msg->req); + if (TLV_GET_DATA_LEN(msg->req) < sizeof(struct tipc_name_table_query)) + return -EINVAL;
depth = ntohl(ntq->depth);
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 2753ca5d9009c180dbfd4c802c80983b4b6108d1 upstream.
BUG: KMSAN: uninit-value in tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335 CPU: 0 PID: 4514 Comm: syz-executor485 Not tainted 4.16.0+ #87 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683 tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335 tipc_nl_compat_recv+0x164b/0x2700 net/tipc/netlink_compat.c:1153 genl_family_rcv_msg net/netlink/genetlink.c:599 [inline] genl_rcv_msg+0x1686/0x1810 net/netlink/genetlink.c:624 netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2447 genl_rcv+0x63/0x80 net/netlink/genetlink.c:635 netlink_unicast_kernel net/netlink/af_netlink.c:1311 [inline] netlink_unicast+0x166b/0x1740 net/netlink/af_netlink.c:1337 netlink_sendmsg+0x1048/0x1310 net/netlink/af_netlink.c:1900 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmsg net/socket.c:2080 [inline] SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091 SyS_sendmsg+0x54/0x80 net/socket.c:2087 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x43fda9 RSP: 002b:00007ffd0c184ba8 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fda9 RDX: 0000000000000000 RSI: 0000000020023000 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000004002c8 R11: 0000000000000213 R12: 00000000004016d0 R13: 0000000000401760 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1183 [inline] netlink_sendmsg+0x9a6/0x1310 net/netlink/af_netlink.c:1875 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmsg net/socket.c:2080 [inline] SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091 SyS_sendmsg+0x54/0x80 net/socket.c:2087 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
In tipc_nl_compat_recv(), when the len value returned by nlmsg_attrlen() is 0, the message is still treated as valid, which is obviously unreasonable. When len is zero, the message not only contains no valid TLV payload, it does not even include a TLV header. In this situation the tlv_type field of the TLV header is still accessed in tipc_nl_compat_dumpit() or tipc_nl_compat_doit(), even though that memory is out of bounds and, of course, uninitialized.
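A user-space sketch of why len == 0 has to be rejected before the TLV header is examined (toy TLV layout, illustrative sizes):

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct toy_tlv {            /* stands in for struct tlv_desc */
    uint16_t tlv_len;       /* length including this header */
    uint16_t tlv_type;
};

/* A zero-length attribute cannot even hold the TLV header, so reading
 * tlv_type from it would touch memory that was never written. */
static int compat_recv(const void *attr, size_t len)
{
    struct toy_tlv tlv;

    if (len < sizeof(tlv))
        return -EOPNOTSUPP;                 /* covers len == 0 */
    memcpy(&tlv, attr, sizeof(tlv));
    if (tlv.tlv_len < sizeof(tlv) || tlv.tlv_len > len)
        return -EOPNOTSUPP;                 /* TLV_OK()-style sanity check */
    return tlv.tlv_type;
}

int main(void)
{
    struct toy_tlv req = { sizeof(req), 42 };

    assert(compat_recv(&req, sizeof(req)) == 42);
    assert(compat_recv(&req, 0) == -EOPNOTSUPP);    /* no header at all */
    return 0;
}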
Reported-by: syzbot+bca0dc46634781f08b38@syzkaller.appspotmail.com Reported-by: syzbot+6bdb590321a7ae40c1a6@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -1249,7 +1249,7 @@ static int tipc_nl_compat_recv(struct sk }
len = nlmsg_attrlen(req_nlh, GENL_HDRLEN + TIPC_GENL_HDRLEN); - if (len && !TLV_OK(msg.req, len)) { + if (!len || !TLV_OK(msg.req, len)) { msg.rep = tipc_get_err_tlv(TIPC_CFG_NOT_SUPPORTED); err = -EOPNOTSUPP; goto send;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit b1ab5fa309e6c49e4e06270ec67dd7b3e9971d04 upstream.
vfs_getattr() needs "struct path" rather than "struct file". Let's use path_get()/path_put() rather than get_file()/fput().
Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1205,7 +1205,7 @@ loop_set_status(struct loop_device *lo, static int loop_get_status(struct loop_device *lo, struct loop_info64 *info) { - struct file *file; + struct path path; struct kstat stat; int ret;
@@ -1230,16 +1230,16 @@ loop_get_status(struct loop_device *lo, }
/* Drop lo_ctl_mutex while we call into the filesystem. */ - file = get_file(lo->lo_backing_file); + path = lo->lo_backing_file->f_path; + path_get(&path); mutex_unlock(&lo->lo_ctl_mutex); - ret = vfs_getattr(&file->f_path, &stat, STATX_INO, - AT_STATX_SYNC_AS_STAT); + ret = vfs_getattr(&path, &stat, STATX_INO, AT_STATX_SYNC_AS_STAT); if (!ret) { info->lo_device = huge_encode_dev(stat.dev); info->lo_inode = stat.ino; info->lo_rdevice = huge_encode_dev(stat.rdev); } - fput(file); + path_put(&path); return ret; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit 310ca162d779efee8a2dc3731439680f3e9c1e86 upstream.
syzbot is reporting a NULL pointer dereference [1] caused by a race between ioctl(loop_fd, LOOP_CLR_FD, 0) and ioctl(other_loop_fd, LOOP_SET_FD, loop_fd), due to traversing other loop devices in loop_validate_file() without holding the corresponding lo->lo_ctl_mutex locks.
Since ioctl() requests on loop devices are not a frequent operation, we do not need fine-grained locking. Let's use a global lock in order to allow safe traversal in loop_validate_file().
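A user-space sketch of the locking model this switches to (pthreads, made-up device table): every ioctl-like operation takes one global mutex, so a traversal of all devices can never observe another device being torn down halfway.

/* build: cc -pthread toy_loop.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_DEVS 4

/* One global lock serializes every ioctl-like operation, so walking the
 * whole device table (as loop_validate_file() does) can never race with
 * another device being cleared at the same time. */
static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;

static struct toy_loop {
    bool bound;
} devs[NR_DEVS];

static void toy_clr_fd(int i)
{
    pthread_mutex_lock(&ctl_mutex);
    devs[i].bound = false;
    pthread_mutex_unlock(&ctl_mutex);
}

static int toy_set_fd(int i)
{
    int bound = 0;

    pthread_mutex_lock(&ctl_mutex);
    /* safe traversal: no other device can change state under us */
    for (int j = 0; j < NR_DEVS; j++)
        bound += devs[j].bound;
    devs[i].bound = true;
    pthread_mutex_unlock(&ctl_mutex);
    return bound;
}

int main(void)
{
    toy_set_fd(0);
    toy_clr_fd(0);
    printf("devices already bound: %d\n", toy_set_fd(1));
    return 0;
}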
Note that syzbot is also reporting a circular locking dependency between bdev->bd_mutex and lo->lo_ctl_mutex [2], caused by calling blkdev_reread_part() with the lock held. This patch does not address that issue.
[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a... [2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588...
Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reported-by: syzbot syzbot+bf89c128e05dd6c62523@syzkaller.appspotmail.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 58 +++++++++++++++++++++++++-------------------------- drivers/block/loop.h | 1 2 files changed, 29 insertions(+), 30 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -84,6 +84,7 @@
static DEFINE_IDR(loop_index_idr); static DEFINE_MUTEX(loop_index_mutex); +static DEFINE_MUTEX(loop_ctl_mutex);
static int max_part; static int part_shift; @@ -1047,7 +1048,7 @@ static int loop_clr_fd(struct loop_devic */ if (atomic_read(&lo->lo_refcnt) > 1) { lo->lo_flags |= LO_FLAGS_AUTOCLEAR; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return 0; }
@@ -1100,12 +1101,12 @@ static int loop_clr_fd(struct loop_devic if (!part_shift) lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN; loop_unprepare_queue(lo); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); /* - * Need not hold lo_ctl_mutex to fput backing file. - * Calling fput holding lo_ctl_mutex triggers a circular + * Need not hold loop_ctl_mutex to fput backing file. + * Calling fput holding loop_ctl_mutex triggers a circular * lock dependency possibility warning as fput can take - * bd_mutex which is usually taken before lo_ctl_mutex. + * bd_mutex which is usually taken before loop_ctl_mutex. */ fput(filp); return 0; @@ -1210,7 +1211,7 @@ loop_get_status(struct loop_device *lo, int ret;
if (lo->lo_state != Lo_bound) { - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return -ENXIO; }
@@ -1229,10 +1230,10 @@ loop_get_status(struct loop_device *lo, lo->lo_encrypt_key_size); }
- /* Drop lo_ctl_mutex while we call into the filesystem. */ + /* Drop loop_ctl_mutex while we call into the filesystem. */ path = lo->lo_backing_file->f_path; path_get(&path); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); ret = vfs_getattr(&path, &stat, STATX_INO, AT_STATX_SYNC_AS_STAT); if (!ret) { info->lo_device = huge_encode_dev(stat.dev); @@ -1324,7 +1325,7 @@ loop_get_status_old(struct loop_device * int err;
if (!arg) { - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return -EINVAL; } err = loop_get_status(lo, &info64); @@ -1342,7 +1343,7 @@ loop_get_status64(struct loop_device *lo int err;
if (!arg) { - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return -EINVAL; } err = loop_get_status(lo, &info64); @@ -1400,7 +1401,7 @@ static int lo_ioctl(struct block_device struct loop_device *lo = bdev->bd_disk->private_data; int err;
- err = mutex_lock_killable_nested(&lo->lo_ctl_mutex, 1); + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); if (err) goto out_unlocked;
@@ -1412,7 +1413,7 @@ static int lo_ioctl(struct block_device err = loop_change_fd(lo, bdev, arg); break; case LOOP_CLR_FD: - /* loop_clr_fd would have unlocked lo_ctl_mutex on success */ + /* loop_clr_fd would have unlocked loop_ctl_mutex on success */ err = loop_clr_fd(lo); if (!err) goto out_unlocked; @@ -1425,7 +1426,7 @@ static int lo_ioctl(struct block_device break; case LOOP_GET_STATUS: err = loop_get_status_old(lo, (struct loop_info __user *) arg); - /* loop_get_status() unlocks lo_ctl_mutex */ + /* loop_get_status() unlocks loop_ctl_mutex */ goto out_unlocked; case LOOP_SET_STATUS64: err = -EPERM; @@ -1435,7 +1436,7 @@ static int lo_ioctl(struct block_device break; case LOOP_GET_STATUS64: err = loop_get_status64(lo, (struct loop_info64 __user *) arg); - /* loop_get_status() unlocks lo_ctl_mutex */ + /* loop_get_status() unlocks loop_ctl_mutex */ goto out_unlocked; case LOOP_SET_CAPACITY: err = -EPERM; @@ -1455,7 +1456,7 @@ static int lo_ioctl(struct block_device default: err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; } - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex);
out_unlocked: return err; @@ -1572,7 +1573,7 @@ loop_get_status_compat(struct loop_devic int err;
if (!arg) { - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return -EINVAL; } err = loop_get_status(lo, &info64); @@ -1589,19 +1590,19 @@ static int lo_compat_ioctl(struct block_
switch(cmd) { case LOOP_SET_STATUS: - err = mutex_lock_killable(&lo->lo_ctl_mutex); + err = mutex_lock_killable(&loop_ctl_mutex); if (!err) { err = loop_set_status_compat(lo, (const struct compat_loop_info __user *)arg); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); } break; case LOOP_GET_STATUS: - err = mutex_lock_killable(&lo->lo_ctl_mutex); + err = mutex_lock_killable(&loop_ctl_mutex); if (!err) { err = loop_get_status_compat(lo, (struct compat_loop_info __user *)arg); - /* loop_get_status() unlocks lo_ctl_mutex */ + /* loop_get_status() unlocks loop_ctl_mutex */ } break; case LOOP_SET_CAPACITY: @@ -1648,7 +1649,7 @@ static void __lo_release(struct loop_dev if (atomic_dec_return(&lo->lo_refcnt)) return;
- mutex_lock(&lo->lo_ctl_mutex); + mutex_lock(&loop_ctl_mutex); if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { /* * In autoclear mode, stop the loop thread @@ -1666,7 +1667,7 @@ static void __lo_release(struct loop_dev blk_mq_unfreeze_queue(lo->lo_queue); }
- mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); }
static void lo_release(struct gendisk *disk, fmode_t mode) @@ -1712,10 +1713,10 @@ static int unregister_transfer_cb(int id struct loop_device *lo = ptr; struct loop_func_table *xfer = data;
- mutex_lock(&lo->lo_ctl_mutex); + mutex_lock(&loop_ctl_mutex); if (lo->lo_encryption == xfer) loop_release_xfer(lo); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return 0; }
@@ -1896,7 +1897,6 @@ static int loop_add(struct loop_device * if (!part_shift) disk->flags |= GENHD_FL_NO_PART_SCAN; disk->flags |= GENHD_FL_EXT_DEVT; - mutex_init(&lo->lo_ctl_mutex); atomic_set(&lo->lo_refcnt, 0); lo->lo_number = i; spin_lock_init(&lo->lo_lock); @@ -2009,21 +2009,21 @@ static long loop_control_ioctl(struct fi ret = loop_lookup(&lo, parm); if (ret < 0) break; - ret = mutex_lock_killable(&lo->lo_ctl_mutex); + ret = mutex_lock_killable(&loop_ctl_mutex); if (ret) break; if (lo->lo_state != Lo_unbound) { ret = -EBUSY; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); break; } if (atomic_read(&lo->lo_refcnt) > 0) { ret = -EBUSY; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); break; } lo->lo_disk->private_data = NULL; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); idr_remove(&loop_index_idr, lo->lo_number); loop_remove(lo); break; --- a/drivers/block/loop.h +++ b/drivers/block/loop.h @@ -54,7 +54,6 @@ struct loop_device {
spinlock_t lo_lock; int lo_state; - struct mutex lo_ctl_mutex; struct kthread_worker worker; struct task_struct *worker_task; bool use_dio;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 967d1dc144b50ad005e5eecdfadfbcfb399ffff6 upstream.
__loop_release() has a single call site. Fold it there. This is currently not a huge win, but it will make the upcoming replacement of loop_index_mutex more obvious.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1642,12 +1642,15 @@ out: return err; }
-static void __lo_release(struct loop_device *lo) +static void lo_release(struct gendisk *disk, fmode_t mode) { + struct loop_device *lo; int err;
+ mutex_lock(&loop_index_mutex); + lo = disk->private_data; if (atomic_dec_return(&lo->lo_refcnt)) - return; + goto unlock_index;
mutex_lock(&loop_ctl_mutex); if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { @@ -1657,7 +1660,7 @@ static void __lo_release(struct loop_dev */ err = loop_clr_fd(lo); if (!err) - return; + goto unlock_index; } else if (lo->lo_state == Lo_bound) { /* * Otherwise keep thread (if running) and config, @@ -1668,12 +1671,7 @@ static void __lo_release(struct loop_dev }
mutex_unlock(&loop_ctl_mutex); -} - -static void lo_release(struct gendisk *disk, fmode_t mode) -{ - mutex_lock(&loop_index_mutex); - __lo_release(disk->private_data); +unlock_index: mutex_unlock(&loop_index_mutex); }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 0a42e99b58a208839626465af194cfe640ef9493 upstream.
Now that loop_ctl_mutex is global, get rid of loop_index_mutex: there is no good reason to keep the two separate, and doing so only complicates the locking.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 41 ++++++++++++++++++++--------------------- 1 file changed, 20 insertions(+), 21 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -83,7 +83,6 @@ #include <linux/uaccess.h>
static DEFINE_IDR(loop_index_idr); -static DEFINE_MUTEX(loop_index_mutex); static DEFINE_MUTEX(loop_ctl_mutex);
static int max_part; @@ -1627,9 +1626,11 @@ static int lo_compat_ioctl(struct block_ static int lo_open(struct block_device *bdev, fmode_t mode) { struct loop_device *lo; - int err = 0; + int err;
- mutex_lock(&loop_index_mutex); + err = mutex_lock_killable(&loop_ctl_mutex); + if (err) + return err; lo = bdev->bd_disk->private_data; if (!lo) { err = -ENXIO; @@ -1638,7 +1639,7 @@ static int lo_open(struct block_device *
atomic_inc(&lo->lo_refcnt); out: - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex); return err; }
@@ -1647,12 +1648,11 @@ static void lo_release(struct gendisk *d struct loop_device *lo; int err;
- mutex_lock(&loop_index_mutex); + mutex_lock(&loop_ctl_mutex); lo = disk->private_data; if (atomic_dec_return(&lo->lo_refcnt)) - goto unlock_index; + goto out_unlock;
- mutex_lock(&loop_ctl_mutex); if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { /* * In autoclear mode, stop the loop thread @@ -1660,7 +1660,7 @@ static void lo_release(struct gendisk *d */ err = loop_clr_fd(lo); if (!err) - goto unlock_index; + return; } else if (lo->lo_state == Lo_bound) { /* * Otherwise keep thread (if running) and config, @@ -1670,9 +1670,8 @@ static void lo_release(struct gendisk *d blk_mq_unfreeze_queue(lo->lo_queue); }
+out_unlock: mutex_unlock(&loop_ctl_mutex); -unlock_index: - mutex_unlock(&loop_index_mutex); }
static const struct block_device_operations lo_fops = { @@ -1973,7 +1972,7 @@ static struct kobject *loop_probe(dev_t struct kobject *kobj; int err;
- mutex_lock(&loop_index_mutex); + mutex_lock(&loop_ctl_mutex); err = loop_lookup(&lo, MINOR(dev) >> part_shift); if (err < 0) err = loop_add(&lo, MINOR(dev) >> part_shift); @@ -1981,7 +1980,7 @@ static struct kobject *loop_probe(dev_t kobj = NULL; else kobj = get_disk_and_module(lo->lo_disk); - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex);
*part = 0; return kobj; @@ -1991,9 +1990,13 @@ static long loop_control_ioctl(struct fi unsigned long parm) { struct loop_device *lo; - int ret = -ENOSYS; + int ret;
- mutex_lock(&loop_index_mutex); + ret = mutex_lock_killable(&loop_ctl_mutex); + if (ret) + return ret; + + ret = -ENOSYS; switch (cmd) { case LOOP_CTL_ADD: ret = loop_lookup(&lo, parm); @@ -2007,9 +2010,6 @@ static long loop_control_ioctl(struct fi ret = loop_lookup(&lo, parm); if (ret < 0) break; - ret = mutex_lock_killable(&loop_ctl_mutex); - if (ret) - break; if (lo->lo_state != Lo_unbound) { ret = -EBUSY; mutex_unlock(&loop_ctl_mutex); @@ -2021,7 +2021,6 @@ static long loop_control_ioctl(struct fi break; } lo->lo_disk->private_data = NULL; - mutex_unlock(&loop_ctl_mutex); idr_remove(&loop_index_idr, lo->lo_number); loop_remove(lo); break; @@ -2031,7 +2030,7 @@ static long loop_control_ioctl(struct fi break; ret = loop_add(&lo, -1); } - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex);
return ret; } @@ -2115,10 +2114,10 @@ static int __init loop_init(void) THIS_MODULE, loop_probe, NULL, NULL);
/* pre-create number of devices given by config or max_loop */ - mutex_lock(&loop_index_mutex); + mutex_lock(&loop_ctl_mutex); for (i = 0; i < nr; i++) loop_add(&lo, i); - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex);
printk(KERN_INFO "loop: module loaded\n"); return 0;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit a13165441d58b216adbd50252a9cc829d78a6bce upstream.
Push acquisition of lo_ctl_mutex down into individual ioctl handling branches. This is a preparatory step for pushing the lock down into individual ioctl handling functions so that they can release the lock as they need it. We also factor out some simple ioctl handlers that will not need any special handling to reduce unnecessary code duplication.
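Roughly, the resulting dispatch has the following shape (a userspace sketch with invented names and pthread locking standing in for loop_ctl_mutex; the real lo_ioctl()/lo_simple_ioctl() split is in the diff below):

#include <pthread.h>

static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;

enum { CMD_SET_FD, CMD_SET_CAPACITY, CMD_SET_BLOCK_SIZE };

/* Helper for the "simple" commands: one lock/unlock around the switch. */
static int simple_ioctl(int cmd)
{
        int err;

        pthread_mutex_lock(&ctl_mutex);
        switch (cmd) {
        case CMD_SET_CAPACITY:
        case CMD_SET_BLOCK_SIZE:
                err = 0;        /* ... do the work under the lock ... */
                break;
        default:
                err = -1;       /* unknown command */
        }
        pthread_mutex_unlock(&ctl_mutex);
        return err;
}

/* Top-level dispatcher: commands with special locking needs handle the
 * lock in their own branch, everything else falls through to the helper. */
static int do_ioctl(int cmd)
{
        switch (cmd) {
        case CMD_SET_FD:
                pthread_mutex_lock(&ctl_mutex);
                /* ... bind the backing file ... */
                pthread_mutex_unlock(&ctl_mutex);
                return 0;
        default:
                return simple_ioctl(cmd);
        }
}

int main(void)
{
        return do_ioctl(CMD_SET_CAPACITY);
}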
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 88 ++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 63 insertions(+), 25 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1394,70 +1394,108 @@ static int loop_set_block_size(struct lo return 0; }
-static int lo_ioctl(struct block_device *bdev, fmode_t mode, - unsigned int cmd, unsigned long arg) +static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd, + unsigned long arg) { - struct loop_device *lo = bdev->bd_disk->private_data; int err;
err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); if (err) - goto out_unlocked; + return err; + switch (cmd) { + case LOOP_SET_CAPACITY: + err = loop_set_capacity(lo); + break; + case LOOP_SET_DIRECT_IO: + err = loop_set_dio(lo, arg); + break; + case LOOP_SET_BLOCK_SIZE: + err = loop_set_block_size(lo, arg); + break; + default: + err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; + } + mutex_unlock(&loop_ctl_mutex); + return err; +} + +static int lo_ioctl(struct block_device *bdev, fmode_t mode, + unsigned int cmd, unsigned long arg) +{ + struct loop_device *lo = bdev->bd_disk->private_data; + int err;
switch (cmd) { case LOOP_SET_FD: + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; err = loop_set_fd(lo, mode, bdev, arg); + mutex_unlock(&loop_ctl_mutex); break; case LOOP_CHANGE_FD: + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; err = loop_change_fd(lo, bdev, arg); + mutex_unlock(&loop_ctl_mutex); break; case LOOP_CLR_FD: + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; /* loop_clr_fd would have unlocked loop_ctl_mutex on success */ err = loop_clr_fd(lo); - if (!err) - goto out_unlocked; + if (err) + mutex_unlock(&loop_ctl_mutex); break; case LOOP_SET_STATUS: err = -EPERM; - if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) + if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) { + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; err = loop_set_status_old(lo, (struct loop_info __user *)arg); + mutex_unlock(&loop_ctl_mutex); + } break; case LOOP_GET_STATUS: + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; err = loop_get_status_old(lo, (struct loop_info __user *) arg); /* loop_get_status() unlocks loop_ctl_mutex */ - goto out_unlocked; + break; case LOOP_SET_STATUS64: err = -EPERM; - if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) + if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) { + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; err = loop_set_status64(lo, (struct loop_info64 __user *) arg); + mutex_unlock(&loop_ctl_mutex); + } break; case LOOP_GET_STATUS64: + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; err = loop_get_status64(lo, (struct loop_info64 __user *) arg); /* loop_get_status() unlocks loop_ctl_mutex */ - goto out_unlocked; - case LOOP_SET_CAPACITY: - err = -EPERM; - if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) - err = loop_set_capacity(lo); break; + case LOOP_SET_CAPACITY: case LOOP_SET_DIRECT_IO: - err = -EPERM; - if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) - err = loop_set_dio(lo, arg); - break; case LOOP_SET_BLOCK_SIZE: - err = -EPERM; - if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) - err = loop_set_block_size(lo, arg); - break; + if (!(mode & FMODE_WRITE) && !capable(CAP_SYS_ADMIN)) + return -EPERM; + /* Fall through */ default: - err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; + err = lo_simple_ioctl(lo, cmd, arg); + break; } - mutex_unlock(&loop_ctl_mutex);
-out_unlocked: return err; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit a2505b799a496b7b84d9a4a14ec870ff9e42e11b upstream.
Move setting of lo_state to Lo_rundown out into the callers. That will allow us to unlock loop_ctl_mutex while the loop device is protected from other changes by its special state.
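In userspace terms, the pattern being introduced looks roughly like this (pthread sketch; ctl_mutex, dev_state and clr_fd() are invented stand-ins for loop_ctl_mutex, lo_state and loop_clr_fd(), and the sketch only mirrors the idea, not the kernel code):

#include <pthread.h>
#include <stdio.h>

enum state { BOUND, RUNDOWN, UNBOUND };

static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;
static enum state dev_state = BOUND;

static int clr_fd(void)
{
        pthread_mutex_lock(&ctl_mutex);
        if (dev_state != BOUND) {               /* already being torn down */
                pthread_mutex_unlock(&ctl_mutex);
                return -1;
        }
        dev_state = RUNDOWN;                    /* claim the device under the lock */
        pthread_mutex_unlock(&ctl_mutex);

        /* The slow teardown now runs without the lock; concurrent callers
         * see RUNDOWN and back off instead of blocking on ctl_mutex. */
        printf("tearing down\n");

        pthread_mutex_lock(&ctl_mutex);
        dev_state = UNBOUND;
        pthread_mutex_unlock(&ctl_mutex);
        return 0;
}

int main(void)
{
        return clr_fd();
}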
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 52 ++++++++++++++++++++++++++++++--------------------- 1 file changed, 31 insertions(+), 21 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -976,7 +976,7 @@ static int loop_set_fd(struct loop_devic loop_reread_partitions(lo, bdev);
/* Grab the block_device to prevent its destruction after we - * put /dev/loopXX inode. Later in loop_clr_fd() we bdput(bdev). + * put /dev/loopXX inode. Later in __loop_clr_fd() we bdput(bdev). */ bdgrab(bdev); return 0; @@ -1026,31 +1026,15 @@ loop_init_xfer(struct loop_device *lo, s return err; }
-static int loop_clr_fd(struct loop_device *lo) +static int __loop_clr_fd(struct loop_device *lo) { struct file *filp = lo->lo_backing_file; gfp_t gfp = lo->old_gfp_mask; struct block_device *bdev = lo->lo_device;
- if (lo->lo_state != Lo_bound) + if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) return -ENXIO;
- /* - * If we've explicitly asked to tear down the loop device, - * and it has an elevated reference count, set it for auto-teardown when - * the last reference goes away. This stops $!~#$@ udev from - * preventing teardown because it decided that it needs to run blkid on - * the loopback device whenever they appear. xfstests is notorious for - * failing tests because blkid via udev races with a losetup - * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d - * command to fail with EBUSY. - */ - if (atomic_read(&lo->lo_refcnt) > 1) { - lo->lo_flags |= LO_FLAGS_AUTOCLEAR; - mutex_unlock(&loop_ctl_mutex); - return 0; - } - if (filp == NULL) return -EINVAL;
@@ -1058,7 +1042,6 @@ static int loop_clr_fd(struct loop_devic blk_mq_freeze_queue(lo->lo_queue);
spin_lock_irq(&lo->lo_lock); - lo->lo_state = Lo_rundown; lo->lo_backing_file = NULL; spin_unlock_irq(&lo->lo_lock);
@@ -1111,6 +1094,30 @@ static int loop_clr_fd(struct loop_devic return 0; }
+static int loop_clr_fd(struct loop_device *lo) +{ + if (lo->lo_state != Lo_bound) + return -ENXIO; + /* + * If we've explicitly asked to tear down the loop device, + * and it has an elevated reference count, set it for auto-teardown when + * the last reference goes away. This stops $!~#$@ udev from + * preventing teardown because it decided that it needs to run blkid on + * the loopback device whenever they appear. xfstests is notorious for + * failing tests because blkid via udev races with a losetup + * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d + * command to fail with EBUSY. + */ + if (atomic_read(&lo->lo_refcnt) > 1) { + lo->lo_flags |= LO_FLAGS_AUTOCLEAR; + mutex_unlock(&loop_ctl_mutex); + return 0; + } + lo->lo_state = Lo_rundown; + + return __loop_clr_fd(lo); +} + static int loop_set_status(struct loop_device *lo, const struct loop_info64 *info) { @@ -1692,11 +1699,14 @@ static void lo_release(struct gendisk *d goto out_unlock;
if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { + if (lo->lo_state != Lo_bound) + goto out_unlock; + lo->lo_state = Lo_rundown; /* * In autoclear mode, stop the loop thread * and remove configuration after last close. */ - err = loop_clr_fd(lo); + err = __loop_clr_fd(lo); if (!err) return; } else if (lo->lo_state == Lo_bound) {
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 7ccd0791d98531df7cd59e92d55e4f063d48a070 upstream.
loop_clr_fd() has a weird locking convention: it expects loop_ctl_mutex to be held on entry, releases it on success, and keeps it held on failure. Untangle the mess by moving the locking of loop_ctl_mutex into loop_clr_fd().
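The difference between the two conventions can be sketched in plain userspace C (invented names, pthread locking; not the actual loop code):

#include <pthread.h>

static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;
static int bound = 1;

/* Old convention: called with ctl_mutex held; unlocks on success,
 * leaves it held on failure, so every caller must know the rule. */
static int clr_fd_old(void)
{
        if (!bound)
                return -1;              /* failure: caller must unlock */
        bound = 0;
        pthread_mutex_unlock(&ctl_mutex);
        return 0;                       /* success: lock already dropped */
}

/* New convention: the function takes and releases the lock itself. */
static int clr_fd_new(void)
{
        int err = 0;

        pthread_mutex_lock(&ctl_mutex);
        if (bound)
                bound = 0;
        else
                err = -1;
        pthread_mutex_unlock(&ctl_mutex);
        return err;
}

int main(void)
{
        pthread_mutex_lock(&ctl_mutex);
        clr_fd_old();                   /* old style: caller pre-locks */
        clr_fd_new();                   /* new style: self-contained */
        return 0;
}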
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 49 +++++++++++++++++++++++++++++-------------------- 1 file changed, 29 insertions(+), 20 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1028,15 +1028,22 @@ loop_init_xfer(struct loop_device *lo, s
static int __loop_clr_fd(struct loop_device *lo) { - struct file *filp = lo->lo_backing_file; + struct file *filp = NULL; gfp_t gfp = lo->old_gfp_mask; struct block_device *bdev = lo->lo_device; + int err = 0;
- if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) - return -ENXIO; + mutex_lock(&loop_ctl_mutex); + if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) { + err = -ENXIO; + goto out_unlock; + }
- if (filp == NULL) - return -EINVAL; + filp = lo->lo_backing_file; + if (filp == NULL) { + err = -EINVAL; + goto out_unlock; + }
/* freeze request queue during the transition */ blk_mq_freeze_queue(lo->lo_queue); @@ -1083,6 +1090,7 @@ static int __loop_clr_fd(struct loop_dev if (!part_shift) lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN; loop_unprepare_queue(lo); +out_unlock: mutex_unlock(&loop_ctl_mutex); /* * Need not hold loop_ctl_mutex to fput backing file. @@ -1090,14 +1098,22 @@ static int __loop_clr_fd(struct loop_dev * lock dependency possibility warning as fput can take * bd_mutex which is usually taken before loop_ctl_mutex. */ - fput(filp); - return 0; + if (filp) + fput(filp); + return err; }
static int loop_clr_fd(struct loop_device *lo) { - if (lo->lo_state != Lo_bound) + int err; + + err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; + if (lo->lo_state != Lo_bound) { + mutex_unlock(&loop_ctl_mutex); return -ENXIO; + } /* * If we've explicitly asked to tear down the loop device, * and it has an elevated reference count, set it for auto-teardown when @@ -1114,6 +1130,7 @@ static int loop_clr_fd(struct loop_devic return 0; } lo->lo_state = Lo_rundown; + mutex_unlock(&loop_ctl_mutex);
return __loop_clr_fd(lo); } @@ -1448,14 +1465,7 @@ static int lo_ioctl(struct block_device mutex_unlock(&loop_ctl_mutex); break; case LOOP_CLR_FD: - err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); - if (err) - return err; - /* loop_clr_fd would have unlocked loop_ctl_mutex on success */ - err = loop_clr_fd(lo); - if (err) - mutex_unlock(&loop_ctl_mutex); - break; + return loop_clr_fd(lo); case LOOP_SET_STATUS: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) { @@ -1691,7 +1701,6 @@ out: static void lo_release(struct gendisk *disk, fmode_t mode) { struct loop_device *lo; - int err;
mutex_lock(&loop_ctl_mutex); lo = disk->private_data; @@ -1702,13 +1711,13 @@ static void lo_release(struct gendisk *d if (lo->lo_state != Lo_bound) goto out_unlock; lo->lo_state = Lo_rundown; + mutex_unlock(&loop_ctl_mutex); /* * In autoclear mode, stop the loop thread * and remove configuration after last close. */ - err = __loop_clr_fd(lo); - if (!err) - return; + __loop_clr_fd(lo); + return; } else if (lo->lo_state == Lo_bound) { /* * Otherwise keep thread (if running) and config,
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 4a5ce9ba5877e4640200d84a735361306ad1a1b8 upstream.
Push loop_ctl_mutex down to loop_get_status() to avoid the unusual convention that the function gets called with loop_ctl_mutex held and releases it.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 37 ++++++++++--------------------------- 1 file changed, 10 insertions(+), 27 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1233,6 +1233,9 @@ loop_get_status(struct loop_device *lo, struct kstat stat; int ret;
+ ret = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (ret) + return ret; if (lo->lo_state != Lo_bound) { mutex_unlock(&loop_ctl_mutex); return -ENXIO; @@ -1347,10 +1350,8 @@ loop_get_status_old(struct loop_device * struct loop_info64 info64; int err;
- if (!arg) { - mutex_unlock(&loop_ctl_mutex); + if (!arg) return -EINVAL; - } err = loop_get_status(lo, &info64); if (!err) err = loop_info64_to_old(&info64, &info); @@ -1365,10 +1366,8 @@ loop_get_status64(struct loop_device *lo struct loop_info64 info64; int err;
- if (!arg) { - mutex_unlock(&loop_ctl_mutex); + if (!arg) return -EINVAL; - } err = loop_get_status(lo, &info64); if (!err && copy_to_user(arg, &info64, sizeof(info64))) err = -EFAULT; @@ -1478,12 +1477,7 @@ static int lo_ioctl(struct block_device } break; case LOOP_GET_STATUS: - err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); - if (err) - return err; - err = loop_get_status_old(lo, (struct loop_info __user *) arg); - /* loop_get_status() unlocks loop_ctl_mutex */ - break; + return loop_get_status_old(lo, (struct loop_info __user *) arg); case LOOP_SET_STATUS64: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) { @@ -1496,12 +1490,7 @@ static int lo_ioctl(struct block_device } break; case LOOP_GET_STATUS64: - err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); - if (err) - return err; - err = loop_get_status64(lo, (struct loop_info64 __user *) arg); - /* loop_get_status() unlocks loop_ctl_mutex */ - break; + return loop_get_status64(lo, (struct loop_info64 __user *) arg); case LOOP_SET_CAPACITY: case LOOP_SET_DIRECT_IO: case LOOP_SET_BLOCK_SIZE: @@ -1626,10 +1615,8 @@ loop_get_status_compat(struct loop_devic struct loop_info64 info64; int err;
- if (!arg) { - mutex_unlock(&loop_ctl_mutex); + if (!arg) return -EINVAL; - } err = loop_get_status(lo, &info64); if (!err) err = loop_info64_to_compat(&info64, arg); @@ -1652,12 +1639,8 @@ static int lo_compat_ioctl(struct block_ } break; case LOOP_GET_STATUS: - err = mutex_lock_killable(&loop_ctl_mutex); - if (!err) { - err = loop_get_status_compat(lo, - (struct compat_loop_info __user *)arg); - /* loop_get_status() unlocks loop_ctl_mutex */ - } + err = loop_get_status_compat(lo, + (struct compat_loop_info __user *)arg); break; case LOOP_SET_CAPACITY: case LOOP_CLR_FD:
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 550df5fdacff94229cde0ed9b8085155654c1696 upstream.
Push loop_ctl_mutex down to loop_set_status(). We will need this to be able to call loop_reread_partitions() without loop_ctl_mutex.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 51 +++++++++++++++++++++++++-------------------------- 1 file changed, 25 insertions(+), 26 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1142,46 +1142,55 @@ loop_set_status(struct loop_device *lo, struct loop_func_table *xfer; kuid_t uid = current_uid();
+ err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (err) + return err; if (lo->lo_encrypt_key_size && !uid_eq(lo->lo_key_owner, uid) && - !capable(CAP_SYS_ADMIN)) - return -EPERM; - if (lo->lo_state != Lo_bound) - return -ENXIO; - if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE) - return -EINVAL; + !capable(CAP_SYS_ADMIN)) { + err = -EPERM; + goto out_unlock; + } + if (lo->lo_state != Lo_bound) { + err = -ENXIO; + goto out_unlock; + } + if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE) { + err = -EINVAL; + goto out_unlock; + }
/* I/O need to be drained during transfer transition */ blk_mq_freeze_queue(lo->lo_queue);
err = loop_release_xfer(lo); if (err) - goto exit; + goto out_unfreeze;
if (info->lo_encrypt_type) { unsigned int type = info->lo_encrypt_type;
if (type >= MAX_LO_CRYPT) { err = -EINVAL; - goto exit; + goto out_unfreeze; } xfer = xfer_funcs[type]; if (xfer == NULL) { err = -EINVAL; - goto exit; + goto out_unfreeze; } } else xfer = NULL;
err = loop_init_xfer(lo, xfer, info); if (err) - goto exit; + goto out_unfreeze;
if (lo->lo_offset != info->lo_offset || lo->lo_sizelimit != info->lo_sizelimit) { if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) { err = -EFBIG; - goto exit; + goto out_unfreeze; } }
@@ -1213,7 +1222,7 @@ loop_set_status(struct loop_device *lo, /* update dio if lo_offset or transfer is changed */ __loop_update_dio(lo, lo->use_dio);
- exit: +out_unfreeze: blk_mq_unfreeze_queue(lo->lo_queue);
if (!err && (info->lo_flags & LO_FLAGS_PARTSCAN) && @@ -1222,6 +1231,8 @@ loop_set_status(struct loop_device *lo, lo->lo_disk->flags &= ~GENHD_FL_NO_PART_SCAN; loop_reread_partitions(lo, lo->lo_device); } +out_unlock: + mutex_unlock(&loop_ctl_mutex);
return err; } @@ -1468,12 +1479,8 @@ static int lo_ioctl(struct block_device case LOOP_SET_STATUS: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) { - err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); - if (err) - return err; err = loop_set_status_old(lo, (struct loop_info __user *)arg); - mutex_unlock(&loop_ctl_mutex); } break; case LOOP_GET_STATUS: @@ -1481,12 +1488,8 @@ static int lo_ioctl(struct block_device case LOOP_SET_STATUS64: err = -EPERM; if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) { - err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); - if (err) - return err; err = loop_set_status64(lo, (struct loop_info64 __user *) arg); - mutex_unlock(&loop_ctl_mutex); } break; case LOOP_GET_STATUS64: @@ -1631,12 +1634,8 @@ static int lo_compat_ioctl(struct block_
switch(cmd) { case LOOP_SET_STATUS: - err = mutex_lock_killable(&loop_ctl_mutex); - if (!err) { - err = loop_set_status_compat(lo, - (const struct compat_loop_info __user *)arg); - mutex_unlock(&loop_ctl_mutex); - } + err = loop_set_status_compat(lo, + (const struct compat_loop_info __user *)arg); break; case LOOP_GET_STATUS: err = loop_get_status_compat(lo,
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 757ecf40b7e029529768eb5f9562d5eeb3002106 upstream.
Push loop_ctl_mutex down to loop_set_fd(). We will need this to be able to call loop_reread_partitions() without loop_ctl_mutex.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -919,13 +919,17 @@ static int loop_set_fd(struct loop_devic if (!file) goto out;
+ error = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (error) + goto out_putf; + error = -EBUSY; if (lo->lo_state != Lo_unbound) - goto out_putf; + goto out_unlock;
error = loop_validate_file(file, bdev); if (error) - goto out_putf; + goto out_unlock;
mapping = file->f_mapping; inode = mapping->host; @@ -937,10 +941,10 @@ static int loop_set_fd(struct loop_devic error = -EFBIG; size = get_loop_size(lo, file); if ((loff_t)(sector_t)size != size) - goto out_putf; + goto out_unlock; error = loop_prepare_queue(lo); if (error) - goto out_putf; + goto out_unlock;
error = 0;
@@ -979,11 +983,14 @@ static int loop_set_fd(struct loop_devic * put /dev/loopXX inode. Later in __loop_clr_fd() we bdput(bdev). */ bdgrab(bdev); + mutex_unlock(&loop_ctl_mutex); return 0;
- out_putf: +out_unlock: + mutex_unlock(&loop_ctl_mutex); +out_putf: fput(file); - out: +out: /* This is safe: open() is still holding a reference. */ module_put(THIS_MODULE); return error; @@ -1461,12 +1468,7 @@ static int lo_ioctl(struct block_device
switch (cmd) { case LOOP_SET_FD: - err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); - if (err) - return err; - err = loop_set_fd(lo, mode, bdev, arg); - mutex_unlock(&loop_ctl_mutex); - break; + return loop_set_fd(lo, mode, bdev, arg); case LOOP_CHANGE_FD: err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); if (err)
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit c371077000f4138ee3c15fbed50101ff24bdc91d upstream.
Push loop_ctl_mutex down to loop_change_fd(). We will need this to be able to call loop_reread_partitions() without loop_ctl_mutex.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -692,19 +692,22 @@ static int loop_change_fd(struct loop_de struct file *file, *old_file; int error;
+ error = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + if (error) + return error; error = -ENXIO; if (lo->lo_state != Lo_bound) - goto out; + goto out_unlock;
/* the loop device has to be read-only */ error = -EINVAL; if (!(lo->lo_flags & LO_FLAGS_READ_ONLY)) - goto out; + goto out_unlock;
error = -EBADF; file = fget(arg); if (!file) - goto out; + goto out_unlock;
error = loop_validate_file(file, bdev); if (error) @@ -731,11 +734,13 @@ static int loop_change_fd(struct loop_de fput(old_file); if (lo->lo_flags & LO_FLAGS_PARTSCAN) loop_reread_partitions(lo, bdev); + mutex_unlock(&loop_ctl_mutex); return 0;
- out_putf: +out_putf: fput(file); - out: +out_unlock: + mutex_unlock(&loop_ctl_mutex); return error; }
@@ -1470,12 +1475,7 @@ static int lo_ioctl(struct block_device case LOOP_SET_FD: return loop_set_fd(lo, mode, bdev, arg); case LOOP_CHANGE_FD: - err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); - if (err) - return err; - err = loop_change_fd(lo, bdev, arg); - mutex_unlock(&loop_ctl_mutex); - break; + return loop_change_fd(lo, bdev, arg); case LOOP_CLR_FD: return loop_clr_fd(lo); case LOOP_SET_STATUS:
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit d57f3374ba4817f7c8d26fae8a13d20ac8d31b92 upstream.
The call of __blkdev_reread_part() from loop_reread_partitions() happens only when we need to invalidate partitions from loop_release(). Thus move the detection of this case into loop_clr_fd() and simplify loop_reread_partitions().
This makes loop_reread_partitions() safe to use without loop_ctl_mutex because we use only lo->lo_number and lo->lo_file_name, and only for error reporting (so possibly reporting outdated information is not a big deal), and the elevated lo->lo_refcnt keeps 'lo' from going away under us.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 33 +++++++++++++++++++-------------- 1 file changed, 19 insertions(+), 14 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -631,18 +631,7 @@ static void loop_reread_partitions(struc { int rc;
- /* - * bd_mutex has been held already in release path, so don't - * acquire it if this function is called in such case. - * - * If the reread partition isn't from release path, lo_refcnt - * must be at least one and it can only become zero when the - * current holder is released. - */ - if (!atomic_read(&lo->lo_refcnt)) - rc = __blkdev_reread_part(bdev); - else - rc = blkdev_reread_part(bdev); + rc = blkdev_reread_part(bdev); if (rc) pr_warn("%s: partition scan of loop%d (%s) failed (rc=%d)\n", __func__, lo->lo_number, lo->lo_file_name, rc); @@ -1096,8 +1085,24 @@ static int __loop_clr_fd(struct loop_dev module_put(THIS_MODULE); blk_mq_unfreeze_queue(lo->lo_queue);
- if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev) - loop_reread_partitions(lo, bdev); + if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev) { + /* + * bd_mutex has been held already in release path, so don't + * acquire it if this function is called in such case. + * + * If the reread partition isn't from release path, lo_refcnt + * must be at least one and it can only become zero when the + * current holder is released. + */ + if (!atomic_read(&lo->lo_refcnt)) + err = __blkdev_reread_part(bdev); + else + err = blkdev_reread_part(bdev); + pr_warn("%s: partition scan of loop%d failed (rc=%d)\n", + __func__, lo->lo_number, err); + /* Device is gone, no point in returning error */ + err = 0; + } lo->lo_flags = 0; if (!part_shift) lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 85b0a54a82e4fbceeb1aebb7cb6909edd1a24668 upstream.
Calling loop_reread_partitions() under loop_ctl_mutex causes lockdep to complain about circular lock dependency between bdev->bd_mutex and lo->lo_ctl_mutex. The problem is that on loop device open or close lo_open() and lo_release() get called with bdev->bd_mutex held and they need to acquire loop_ctl_mutex. OTOH when loop_reread_partitions() is called with loop_ctl_mutex held, it will call blkdev_reread_part() which acquires bdev->bd_mutex. See syzbot report for details [1].
Move all calls of loop_reread_partitions() out of loop_ctl_mutex to avoid the lockdep warning and fix the deadlock possibility.
[1] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588
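In schematic form the two orderings look like this (pthread sketch; the mutex names only mirror bdev->bd_mutex and loop_ctl_mutex, and the paths are run one after another so the program itself does not deadlock):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t bd_mutex = PTHREAD_MUTEX_INITIALIZER;   /* stands in for bdev->bd_mutex */
static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;  /* stands in for loop_ctl_mutex */

static void open_path(void)             /* lo_open()/lo_release(): bd_mutex, then ctl_mutex */
{
        pthread_mutex_lock(&bd_mutex);
        pthread_mutex_lock(&ctl_mutex);
        puts("open/release: bd_mutex -> ctl_mutex");
        pthread_mutex_unlock(&ctl_mutex);
        pthread_mutex_unlock(&bd_mutex);
}

static void rescan_path_old(void)       /* reread under ctl_mutex: ctl_mutex, then bd_mutex */
{
        pthread_mutex_lock(&ctl_mutex);
        pthread_mutex_lock(&bd_mutex);  /* opposite order: the AB-BA pattern lockdep flags */
        puts("rescan (old): ctl_mutex -> bd_mutex");
        pthread_mutex_unlock(&bd_mutex);
        pthread_mutex_unlock(&ctl_mutex);
}

static void rescan_path_new(void)       /* fix: drop ctl_mutex before rescanning partitions */
{
        pthread_mutex_lock(&ctl_mutex);
        /* ... update the loop device state ... */
        pthread_mutex_unlock(&ctl_mutex);
        pthread_mutex_lock(&bd_mutex);  /* only one of the two locks is held at a time */
        puts("rescan (new): bd_mutex alone");
        pthread_mutex_unlock(&bd_mutex);
}

int main(void)
{
        open_path();
        rescan_path_old();
        rescan_path_new();
        return 0;
}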
Reported-by: syzbot syzbot+4684a000d5abdade83fac55b1e7d1f935ef1936e@syzkaller.appspotmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -680,6 +680,7 @@ static int loop_change_fd(struct loop_de { struct file *file, *old_file; int error; + bool partscan;
error = mutex_lock_killable_nested(&loop_ctl_mutex, 1); if (error) @@ -721,9 +722,10 @@ static int loop_change_fd(struct loop_de blk_mq_unfreeze_queue(lo->lo_queue);
fput(old_file); - if (lo->lo_flags & LO_FLAGS_PARTSCAN) - loop_reread_partitions(lo, bdev); + partscan = lo->lo_flags & LO_FLAGS_PARTSCAN; mutex_unlock(&loop_ctl_mutex); + if (partscan) + loop_reread_partitions(lo, bdev); return 0;
out_putf: @@ -904,6 +906,7 @@ static int loop_set_fd(struct loop_devic int lo_flags = 0; int error; loff_t size; + bool partscan;
/* This is safe, since we have a reference from open(). */ __module_get(THIS_MODULE); @@ -970,14 +973,15 @@ static int loop_set_fd(struct loop_devic lo->lo_state = Lo_bound; if (part_shift) lo->lo_flags |= LO_FLAGS_PARTSCAN; - if (lo->lo_flags & LO_FLAGS_PARTSCAN) - loop_reread_partitions(lo, bdev); + partscan = lo->lo_flags & LO_FLAGS_PARTSCAN;
/* Grab the block_device to prevent its destruction after we * put /dev/loopXX inode. Later in __loop_clr_fd() we bdput(bdev). */ bdgrab(bdev); mutex_unlock(&loop_ctl_mutex); + if (partscan) + loop_reread_partitions(lo, bdev); return 0;
out_unlock: @@ -1158,6 +1162,8 @@ loop_set_status(struct loop_device *lo, int err; struct loop_func_table *xfer; kuid_t uid = current_uid(); + struct block_device *bdev; + bool partscan = false;
err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); if (err) @@ -1246,10 +1252,13 @@ out_unfreeze: !(lo->lo_flags & LO_FLAGS_PARTSCAN)) { lo->lo_flags |= LO_FLAGS_PARTSCAN; lo->lo_disk->flags &= ~GENHD_FL_NO_PART_SCAN; - loop_reread_partitions(lo, lo->lo_device); + bdev = lo->lo_device; + partscan = true; } out_unlock: mutex_unlock(&loop_ctl_mutex); + if (partscan) + loop_reread_partitions(lo, bdev);
return err; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 0da03cab87e6323ff2e05b14bc7d5c6fcc531efd upstream.
Calling blkdev_reread_part() under loop_ctl_mutex causes lockdep to complain about circular lock dependency between bdev->bd_mutex and lo->lo_ctl_mutex. The problem is that on loop device open or close lo_open() and lo_release() get called with bdev->bd_mutex held and they need to acquire loop_ctl_mutex. OTOH when loop_reread_partitions() is called with loop_ctl_mutex held, it will call blkdev_reread_part() which acquires bdev->bd_mutex. See syzbot report for details [1].
Move the call to blkdev_reread_part() in __loop_clr_fd() out from under loop_ctl_mutex to finish fixing the lockdep warning and the possible deadlock.
[1] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588
Reported-by: syzbot syzbot+4684a000d5abdade83fac55b1e7d1f935ef1936e@syzkaller.appspotmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1031,12 +1031,14 @@ loop_init_xfer(struct loop_device *lo, s return err; }
-static int __loop_clr_fd(struct loop_device *lo) +static int __loop_clr_fd(struct loop_device *lo, bool release) { struct file *filp = NULL; gfp_t gfp = lo->old_gfp_mask; struct block_device *bdev = lo->lo_device; int err = 0; + bool partscan = false; + int lo_number;
mutex_lock(&loop_ctl_mutex); if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) { @@ -1089,7 +1091,15 @@ static int __loop_clr_fd(struct loop_dev module_put(THIS_MODULE); blk_mq_unfreeze_queue(lo->lo_queue);
- if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev) { + partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev; + lo_number = lo->lo_number; + lo->lo_flags = 0; + if (!part_shift) + lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN; + loop_unprepare_queue(lo); +out_unlock: + mutex_unlock(&loop_ctl_mutex); + if (partscan) { /* * bd_mutex has been held already in release path, so don't * acquire it if this function is called in such case. @@ -1098,21 +1108,15 @@ static int __loop_clr_fd(struct loop_dev * must be at least one and it can only become zero when the * current holder is released. */ - if (!atomic_read(&lo->lo_refcnt)) + if (release) err = __blkdev_reread_part(bdev); else err = blkdev_reread_part(bdev); pr_warn("%s: partition scan of loop%d failed (rc=%d)\n", - __func__, lo->lo_number, err); + __func__, lo_number, err); /* Device is gone, no point in returning error */ err = 0; } - lo->lo_flags = 0; - if (!part_shift) - lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN; - loop_unprepare_queue(lo); -out_unlock: - mutex_unlock(&loop_ctl_mutex); /* * Need not hold loop_ctl_mutex to fput backing file. * Calling fput holding loop_ctl_mutex triggers a circular @@ -1153,7 +1157,7 @@ static int loop_clr_fd(struct loop_devic lo->lo_state = Lo_rundown; mutex_unlock(&loop_ctl_mutex);
- return __loop_clr_fd(lo); + return __loop_clr_fd(lo, false); }
static int @@ -1714,7 +1718,7 @@ static void lo_release(struct gendisk *d * In autoclear mode, stop the loop thread * and remove configuration after last close. */ - __loop_clr_fd(lo); + __loop_clr_fd(lo, true); return; } else if (lo->lo_state == Lo_bound) { /*
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 1dded9acf6dc9a34cd27fcf8815507e4e65b3c4f upstream.
Code in loop_change_fd() drops the reference to the old file (and also to the new file in the failure case) under loop_ctl_mutex. As in loop_set_fd(), this can create a circular locking dependency if that was the last reference holding the file open. Delay dropping the file reference until we have released loop_ctl_mutex.
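The idea — swap the file under the lock, drop the old reference only after unlocking — looks roughly like this in userspace terms (invented names; close() stands in for fput(), which in the kernel can take bd_mutex):

#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;
static int backing_fd = -1;

static int change_backing(int new_fd)
{
        int old_fd;

        pthread_mutex_lock(&ctl_mutex);
        old_fd = backing_fd;            /* swap under the lock ... */
        backing_fd = new_fd;
        pthread_mutex_unlock(&ctl_mutex);

        if (old_fd >= 0)
                close(old_fd);          /* ... but drop the old reference only
                                         * after the lock has been released */
        return 0;
}

int main(void)
{
        backing_fd = open("/dev/null", O_RDONLY);
        return change_backing(open("/dev/zero", O_RDONLY));
}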
Reported-by: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -678,7 +678,7 @@ static int loop_validate_file(struct fil static int loop_change_fd(struct loop_device *lo, struct block_device *bdev, unsigned int arg) { - struct file *file, *old_file; + struct file *file = NULL, *old_file; int error; bool partscan;
@@ -687,21 +687,21 @@ static int loop_change_fd(struct loop_de return error; error = -ENXIO; if (lo->lo_state != Lo_bound) - goto out_unlock; + goto out_err;
/* the loop device has to be read-only */ error = -EINVAL; if (!(lo->lo_flags & LO_FLAGS_READ_ONLY)) - goto out_unlock; + goto out_err;
error = -EBADF; file = fget(arg); if (!file) - goto out_unlock; + goto out_err;
error = loop_validate_file(file, bdev); if (error) - goto out_putf; + goto out_err;
old_file = lo->lo_backing_file;
@@ -709,7 +709,7 @@ static int loop_change_fd(struct loop_de
/* size of the new backing store needs to be the same */ if (get_loop_size(lo, file) != get_loop_size(lo, old_file)) - goto out_putf; + goto out_err;
/* and ... switch */ blk_mq_freeze_queue(lo->lo_queue); @@ -720,18 +720,22 @@ static int loop_change_fd(struct loop_de lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)); loop_update_dio(lo); blk_mq_unfreeze_queue(lo->lo_queue); - - fput(old_file); partscan = lo->lo_flags & LO_FLAGS_PARTSCAN; mutex_unlock(&loop_ctl_mutex); + /* + * We must drop file reference outside of loop_ctl_mutex as dropping + * the file ref can take bd_mutex which creates circular locking + * dependency. + */ + fput(old_file); if (partscan) loop_reread_partitions(lo, bdev); return 0;
-out_putf: - fput(file); -out_unlock: +out_err: mutex_unlock(&loop_ctl_mutex); + if (file) + fput(file); return error; }
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit c28445fa06a3a54e06938559b9514c5a7f01c90f upstream.
The nested acquisition of loop_ctl_mutex (->lo_ctl_mutex back then) was introduced by commit f028f3b2f987e "loop: fix circular locking in loop_clr_fd()" to silence lockdep complaints about bd_mutex being acquired after lo_ctl_mutex during partition rereading. Now that the ordering problems are properly fixed, let's stop fooling lockdep.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -682,7 +682,7 @@ static int loop_change_fd(struct loop_de int error; bool partscan;
- error = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + error = mutex_lock_killable(&loop_ctl_mutex); if (error) return error; error = -ENXIO; @@ -920,7 +920,7 @@ static int loop_set_fd(struct loop_devic if (!file) goto out;
- error = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + error = mutex_lock_killable(&loop_ctl_mutex); if (error) goto out_putf;
@@ -1136,7 +1136,7 @@ static int loop_clr_fd(struct loop_devic { int err;
- err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + err = mutex_lock_killable(&loop_ctl_mutex); if (err) return err; if (lo->lo_state != Lo_bound) { @@ -1173,7 +1173,7 @@ loop_set_status(struct loop_device *lo, struct block_device *bdev; bool partscan = false;
- err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + err = mutex_lock_killable(&loop_ctl_mutex); if (err) return err; if (lo->lo_encrypt_key_size && @@ -1278,7 +1278,7 @@ loop_get_status(struct loop_device *lo, struct kstat stat; int ret;
- ret = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + ret = mutex_lock_killable(&loop_ctl_mutex); if (ret) return ret; if (lo->lo_state != Lo_bound) { @@ -1467,7 +1467,7 @@ static int lo_simple_ioctl(struct loop_d { int err;
- err = mutex_lock_killable_nested(&loop_ctl_mutex, 1); + err = mutex_lock_killable(&loop_ctl_mutex); if (err) return err; switch (cmd) {
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit 628bd85947091830a8c4872adfd5ed1d515a9cf2 upstream.
Commit 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") forgot to remove mutex_unlock(&loop_ctl_mutex) from loop_control_ioctl() when replacing loop_index_mutex with loop_ctl_mutex.
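Schematically, the bug is a per-branch unlock that survived the conversion to a single lock/unlock around the whole switch (userspace sketch with invented names, not the actual loop_control_ioctl() code):

#include <pthread.h>

static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;

static long control_ioctl(int cmd)
{
        long ret;

        pthread_mutex_lock(&ctl_mutex);
        switch (cmd) {
        case 0:                                 /* e.g. removing a busy device */
                ret = -16;                      /* -EBUSY */
                /* pthread_mutex_unlock(&ctl_mutex);   <-- the leftover unlock:
                 * together with the one below it would unlock twice */
                break;
        default:
                ret = 0;
        }
        pthread_mutex_unlock(&ctl_mutex);       /* the single unlock on the common path */
        return ret;
}

int main(void)
{
        return control_ioctl(0) == -16 ? 0 : 1;
}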
Fixes: 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") Reported-by: syzbot syzbot+c0138741c2290fc5e63f@syzkaller.appspotmail.com Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -2075,12 +2075,10 @@ static long loop_control_ioctl(struct fi break; if (lo->lo_state != Lo_unbound) { ret = -EBUSY; - mutex_unlock(&loop_ctl_mutex); break; } if (atomic_read(&lo->lo_refcnt) > 0) { ret = -EBUSY; - mutex_unlock(&loop_ctl_mutex); break; } lo->lo_disk->private_data = NULL;
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jaegeuk Kim jaegeuk@kernel.org
commit 5db470e229e22b7eda6e23b5566e532c96fb5bc3 upstream.
If we don't drop the caches used at the old offset or block_size, reads at the new offset/block_size can return stale data, which gives unexpected data to the user.
For example, Martijn found a loopback bug in the following scenario:
1) LOOP_SET_FD loads the first two pages of the loop file
2) LOOP_SET_STATUS64 changes the offset on the loop file
3) mount fails because the cached pages contain the wrong superblock
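In userspace terms the scenario boils down to roughly the following ioctl sequence (device and backing file paths are placeholders, error handling and the privileges this needs are omitted):

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>

int main(void)
{
        int loop_fd = open("/dev/loop0", O_RDWR);         /* placeholder device */
        int file_fd = open("/tmp/backing.img", O_RDWR);   /* placeholder backing file */
        struct loop_info64 info;

        /* 1) bind the backing file; its first pages may end up cached */
        ioctl(loop_fd, LOOP_SET_FD, file_fd);

        /* 2) move the data offset; without the fix, pages cached for the
         *    old offset can still be served for the new one */
        memset(&info, 0, sizeof(info));
        info.lo_offset = 4096;
        ioctl(loop_fd, LOOP_SET_STATUS64, &info);

        /* 3) a subsequent mount of /dev/loop0 can then see a stale superblock */
        close(file_fd);
        close(loop_fd);
        return 0;
}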
Cc: Jens Axboe axboe@kernel.dk Cc: linux-block@vger.kernel.org Reported-by: Martijn Coenen maco@google.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1191,6 +1191,12 @@ loop_set_status(struct loop_device *lo, goto out_unlock; }
+ if (lo->lo_offset != info->lo_offset || + lo->lo_sizelimit != info->lo_sizelimit) { + sync_blockdev(lo->lo_device); + kill_bdev(lo->lo_device); + } + /* I/O need to be drained during transfer transition */ blk_mq_freeze_queue(lo->lo_queue);
@@ -1219,6 +1225,14 @@ loop_set_status(struct loop_device *lo,
if (lo->lo_offset != info->lo_offset || lo->lo_sizelimit != info->lo_sizelimit) { + /* kill_bdev should have truncated all the pages */ + if (lo->lo_device->bd_inode->i_mapping->nrpages) { + err = -EAGAIN; + pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n", + __func__, lo->lo_number, lo->lo_file_name, + lo->lo_device->bd_inode->i_mapping->nrpages); + goto out_unfreeze; + } if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) { err = -EFBIG; goto out_unfreeze; @@ -1444,22 +1458,39 @@ static int loop_set_dio(struct loop_devi
static int loop_set_block_size(struct loop_device *lo, unsigned long arg) { + int err = 0; + if (lo->lo_state != Lo_bound) return -ENXIO;
if (arg < 512 || arg > PAGE_SIZE || !is_power_of_2(arg)) return -EINVAL;
+ if (lo->lo_queue->limits.logical_block_size != arg) { + sync_blockdev(lo->lo_device); + kill_bdev(lo->lo_device); + } + blk_mq_freeze_queue(lo->lo_queue);
+ /* kill_bdev should have truncated all the pages */ + if (lo->lo_queue->limits.logical_block_size != arg && + lo->lo_device->bd_inode->i_mapping->nrpages) { + err = -EAGAIN; + pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n", + __func__, lo->lo_number, lo->lo_file_name, + lo->lo_device->bd_inode->i_mapping->nrpages); + goto out_unfreeze; + } + blk_queue_logical_block_size(lo->lo_queue, arg); blk_queue_physical_block_size(lo->lo_queue, arg); blk_queue_io_min(lo->lo_queue, arg); loop_update_dio(lo); - +out_unfreeze: blk_mq_unfreeze_queue(lo->lo_queue);
- return 0; + return err; }
static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ivan Mironov mironov.ivan@gmail.com
commit 66a8d5bfb518f9f12d47e1d2dce1732279f9451e upstream.
The strict requirement that pixclock be zero breaks support for SDL 1.2, which contains a hardcoded table of supported video modes with non-zero pixclock values [1].
To better understand which pixclock values are considered valid and how a driver should handle them, I briefly examined a few existing fbdev drivers and the documentation in Documentation/fb/. It looks like there are no strict rules and the actual behaviour varies:
* some drivers treat (pixclock == 0) as "use defaults" (uvesafb.c);
* some treat (pixclock == 0) as an invalid value which leads to -EINVAL (clps711x-fb.c);
* some pass the converted pixclock value to the hardware (uvesafb.c);
* some try to find the nearest value from a predefined table (vga16fb.c, video_gx.c).
Given this, I believe it should be safe to simply ignore this value when changing it is not supported. Any portable fbdev application that was not written for one specific device under one specific kernel version should not rely on any particular pixclock behaviour anyway.
However, while this change enables SDL1 applications to work out of the box when there is no /etc/fb.modes with valid settings, it also affects the video mode selection logic in SDL. Depending on the current screen resolution, the contents of /etc/fb.modes and the resolution requested by the application, this may lead to a user-visible difference (though not always): the image is displayed correctly, but aligned to the left instead of centered. There is no single "right behaviour" here either, as the emulated fbdev, unlike old fbdev drivers, simply ignores any request to change the video mode to a resolution smaller than the current one.
The easiest way to reproduce this problem is to install sdl-sopwith [2], remove the /etc/fb.modes file if it exists, and then try to run sopwith from the console without X. At least in Fedora 29, sopwith can be installed straight from the standard repositories.
[1] SDL 1.2.15 source code, src/video/fbcon/SDL_fbvideo.c, vesa_timings [2] http://sdl-sopwith.sourceforge.net/
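For reference, the kind of request SDL 1.2 ends up making boils down to something like the sketch below (device path and timing value are placeholders; error handling is trimmed):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fb.h>

int main(void)
{
        int fd = open("/dev/fb0", O_RDWR);      /* placeholder device */
        struct fb_var_screeninfo var;

        ioctl(fd, FBIOGET_VSCREENINFO, &var);
        var.pixclock = 39721;                   /* e.g. a 640x480@60 timing from a mode table */

        /* Before this patch the DRM fbdev emulation returned -EINVAL here;
         * with it, the non-zero pixclock is ignored instead. */
        if (ioctl(fd, FBIOPUT_VSCREENINFO, &var) < 0)
                perror("FBIOPUT_VSCREENINFO");

        close(fd);
        return 0;
}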
Signed-off-by: Ivan Mironov mironov.ivan@gmail.com Cc: stable@vger.kernel.org Fixes: 79e539453b34e ("DRM: i915: add mode setting support") Fixes: 771fe6b912fca ("drm/radeon: introduce kernel modesetting for radeon hardware") Fixes: 785b93ef8c309 ("drm/kms: move driver specific fb common code to helper functions (v2)") Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20190108072353.28078-3-mironov... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/drm_fb_helper.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -1690,9 +1690,14 @@ int drm_fb_helper_check_var(struct fb_va struct drm_fb_helper *fb_helper = info->par; struct drm_framebuffer *fb = fb_helper->fb;
- if (var->pixclock != 0 || in_dbg_master()) + if (in_dbg_master()) return -EINVAL;
+ if (var->pixclock != 0) { + DRM_DEBUG("fbdev emulation doesn't support changing the pixel clock, value of pixclock is ignored\n"); + var->pixclock = 0; + } + /* * Changes struct fb_var_screeninfo are currently not pushed back * to KMS, hence fail if different settings are requested.
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuah Khan shuah@kernel.org
commit 211929fd3f7c8de4d541b1cc243b82830e5ea1e8 upstream.
Commit b2d35fa5fc80 ("selftests: add headers_install to lib.mk") added a khdr target to run the headers_install target from the main Makefile. The logic uses KSFT_KHDR_INSTALL and top_srcdir as controls to initialize variables and include files to run headers_install from the top level Makefile. There are a few problems with this logic.
1. It exposes top_srcdir to all tests.
2. The common logic impacts all tests.
3. It uses KSFT_KHDR_INSTALL, top_srcdir, and khdr in an ad hoc way: tests add a "khdr" dependency in their Makefiles to TEST_PROGS_EXTENDED in some cases and to STATIC_LIBS in others, which makes the framework confusing to use.
The common logic runs for all tests even when KSFT_KHDR_INSTALL isn't defined by the test. top_srcdir is initialized to a default value when the test doesn't set it, which works for tests without a sub-dir structure, but tests with a sub-dir structure fail to build.
e.g.: make -C sparc64/drivers/ or make -C drivers/dma-buf
../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory
make: *** No rule to make target '../../../../scripts/subarch.include'. Stop.
There is no reason to require all tests to define top_srcdir, and no need to require tests to add a khdr dependency through ad hoc changes to TEST_* and other variables.
Fix it with a consistent use of KSFT_KHDR_INSTALL and top_srcdir from tests that have the dependency on headers_install.
Change the common logic to define the khdr target, and an "all" target that depends on khdr, only when KSFT_KHDR_INSTALL is defined.
Only tests that depend on headers_install have to define the KSFT_KHDR_INSTALL and top_srcdir variables; there is no need to specify a khdr dependency in the test Makefiles.
Fixes: b2d35fa5fc80 ("selftests: add headers_install to lib.mk") Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan shuah@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 tools/testing/selftests/android/Makefile                 | 2 +-
 tools/testing/selftests/futex/functional/Makefile        | 1 +
 tools/testing/selftests/gpio/Makefile                    | 1 +
 tools/testing/selftests/kvm/Makefile                     | 2 +-
 tools/testing/selftests/lib.mk                           | 8 ++++----
 tools/testing/selftests/networking/timestamping/Makefile | 1 +
 tools/testing/selftests/vm/Makefile                      | 1 +
 7 files changed, 10 insertions(+), 6 deletions(-)
--- a/tools/testing/selftests/android/Makefile +++ b/tools/testing/selftests/android/Makefile @@ -6,7 +6,7 @@ TEST_PROGS := run.sh
include ../lib.mk
-all: khdr +all: @for DIR in $(SUBDIRS); do \ BUILD_TARGET=$(OUTPUT)/$$DIR; \ mkdir $$BUILD_TARGET -p; \ --- a/tools/testing/selftests/futex/functional/Makefile +++ b/tools/testing/selftests/futex/functional/Makefile @@ -19,6 +19,7 @@ TEST_GEN_FILES := \ TEST_PROGS := run.sh
top_srcdir = ../../../../.. +KSFT_KHDR_INSTALL := 1 include ../../lib.mk
$(TEST_GEN_FILES): $(HEADERS) --- a/tools/testing/selftests/gpio/Makefile +++ b/tools/testing/selftests/gpio/Makefile @@ -9,6 +9,7 @@ EXTRA_OBJS := ../gpiogpio-event-mon-in.o EXTRA_OBJS += ../gpiogpio-hammer-in.o ../gpiogpio-utils.o ../gpiolsgpio-in.o EXTRA_OBJS += ../gpiolsgpio.o
+KSFT_KHDR_INSTALL := 1 include ../lib.mk
all: $(BINARIES) --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -1,6 +1,7 @@ all:
top_srcdir = ../../../../ +KSFT_KHDR_INSTALL := 1 UNAME_M := $(shell uname -m)
LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c @@ -40,4 +41,3 @@ $(OUTPUT)/libkvm.a: $(LIBKVM_OBJ)
all: $(STATIC_LIBS) $(TEST_GEN_PROGS): $(STATIC_LIBS) -$(STATIC_LIBS):| khdr --- a/tools/testing/selftests/lib.mk +++ b/tools/testing/selftests/lib.mk @@ -16,18 +16,18 @@ TEST_GEN_PROGS := $(patsubst %,$(OUTPUT) TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED)) TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES))
+ifdef KSFT_KHDR_INSTALL top_srcdir ?= ../../../.. include $(top_srcdir)/scripts/subarch.include ARCH ?= $(SUBARCH)
-all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) - .PHONY: khdr khdr: make ARCH=$(ARCH) -C $(top_srcdir) headers_install
-ifdef KSFT_KHDR_INSTALL -$(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES):| khdr +all: khdr $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) +else +all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) endif
.ONESHELL: --- a/tools/testing/selftests/networking/timestamping/Makefile +++ b/tools/testing/selftests/networking/timestamping/Makefile @@ -6,6 +6,7 @@ TEST_PROGS := hwtstamp_config rxtimestam all: $(TEST_PROGS)
top_srcdir = ../../../../.. +KSFT_KHDR_INSTALL := 1 include ../../lib.mk
clean: --- a/tools/testing/selftests/vm/Makefile +++ b/tools/testing/selftests/vm/Makefile @@ -24,6 +24,7 @@ TEST_GEN_FILES += virtual_address_range
TEST_PROGS := run_vmtests
+KSFT_KHDR_INSTALL := 1 include ../lib.mk
$(OUTPUT)/userfaultfd: LDLIBS += -lpthread
On Mon, 21 Jan 2019 at 19:30, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary
------------------------------------------------------------------------
kernel: 4.19.17-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: 23a84dfade5b350d25ab6afd6a96244a46572511
git describe: v4.19.16-102-g23a84dfade5b
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.16-10...
No regressions (compared to build v4.19.16)
No fixes (compared to build v4.19.16)
Ran 14175 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
Test Suites
-----------
* boot
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* spectre-meltdown-checker-test
* ltp-open-posix-tests
On Mon, Jan 21, 2019 at 02:47:52PM +0100, Greg Kroah-Hartman wrote:
Build results:
	total: 156 pass: 156 fail: 0
Qemu test results:
	total: 343 pass: 343 fail: 0
Guenter
On 1/21/19 6:47 AM, Greg Kroah-Hartman wrote:
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah