This is the start of the stable review cycle for the 4.9.152 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 23 12:24:02 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.152-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.9.152-rc1
Michal Hocko mhocko@suse.com mm, memcg: fix reclaim deadlock with writeback
Ivan Mironov mironov.ivan@gmail.com drm/fb-helper: Ignore the value of fb_var_screeninfo.pixclock
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()
Jan Kara jack@suse.cz loop: Get rid of loop_index_mutex
Jan Kara jack@suse.cz loop: Fold __loop_release into loop_release
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp block/loop: Use global lock for ioctl() operation.
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_doit
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_name_table_dump
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_link_set
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_bearer_enable
Ying Xue ying.xue@windriver.com tipc: fix uninit-value in tipc_nl_compat_link_reset_stats
Xin Long lucien.xin@gmail.com sctp: allocate sctp_sockaddr_entry with kzalloc
Jan Kara jack@suse.cz blockdev: Fix livelocks on loop device
Stephen Smalley sds@tycho.nsa.gov selinux: fix GPF on invalid policy
Shakeel Butt shakeelb@google.com netfilter: ebtables: account ebt_table_info to kmemcg
J. Bruce Fields bfields@redhat.com sunrpc: handle ENOMEM in rpcb_getport_async
Hans Verkuil hverkuil@xs4all.nl media: vb2: vb2_mmap: move lock up
James Morris james.morris@microsoft.com LSM: Check for NULL cred-security on free
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: set min width/height to a value > 0
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: fix error handling of kthread_run
Vlad Tsyrklevich vlad@tsyrklevich.net omap2fb: Fix stack memory disclosure
YunQiang Su ysu@wavecomp.com Disable MSI also when pcie-octeon.pcie_disable on
Ard Biesheuvel ard.biesheuvel@linaro.org arm64: kaslr: ensure randomized quantities are clean to the PoC
Jonathan Hunter jonathanh@nvidia.com mfd: tps6586x: Handle interrupts on suspend
Arnd Bergmann arnd@arndb.de mips: fix n32 compat_ipc_parse_version
Christophe Leroy christophe.leroy@c-s.fr crypto: talitos - fix ablkcipher for CONFIG_VMAP_STACK
Christophe Leroy christophe.leroy@c-s.fr crypto: talitos - reorder code in talitos_edesc_alloc()
Ivan Mironov mironov.ivan@gmail.com scsi: sd: Fix cache_type_store()
Stanley Chu stanley.chu@mediatek.com scsi: core: Synchronize request queue PM status only on successful resume
Kees Cook keescook@chromium.org Yama: Check for pid death before checking ancestry
Josef Bacik josef@toxicpanda.com btrfs: wait on ordered extents on abort cleanup
Eric Biggers ebiggers@google.com crypto: authenc - fix parsing key with misaligned rta_len
Harsh Jain harsh@chelsio.com crypto: authencesn - Avoid twice completion call in decrypt path
Aymen Sghaier aymen.sghaier@nxp.com crypto: caam - fix zero-length buffer DMA mapping
Willem de Bruijn willemb@google.com ip: on queued skb use skb_header_pointer instead of pskb_may_pull
Willem de Bruijn willemb@google.com bonding: update nest level on unlink
Jason Gunthorpe jgg@mellanox.com packet: Do not leak dev refcounts on error exit
JianJhen Chen kchen@synology.com net: bridge: fix a bug on using a neighbour cache entry without checking its state
Eric Dumazet edumazet@google.com ipv6: fix kernel-infoleak in ipv6_local_error()
Mark Rutland mark.rutland@arm.com arm64: Don't trap host pointer auth use to EL2
Mark Rutland mark.rutland@arm.com arm64/kvm: consistently handle host HCR_EL2 flags
Varun Prakash varun@chelsio.com scsi: target: iscsi: cxgbit: fix csk leak
Sasha Levin sashal@kernel.org Revert "scsi: target: iscsi: cxgbit: fix csk leak"
Guenter Roeck linux@roeck-us.net proc: Remove empty line in /proc/self/status
Ben Hutchings ben@decadent.org.uk media: em28xx: Fix misplaced reset of dev->v4l::field_count
Chao Yu yuchao0@huawei.com Revert "f2fs: do not recover from previous remained wrong dnodes"
Oliver Hartkopp socketcan@hartkopp.net can: gw: ensure DLC boundaries after CAN frame modification
Dmitry Safonov dima@arista.com tty: Don't hold ldisc lock in tty_reopen() if ldisc present
Dmitry Safonov dima@arista.com tty: Simplify tty->count math in tty_reopen()
Dmitry Safonov dima@arista.com tty: Hold tty_ldisc_lock() during tty_reopen()
Dmitry Safonov dima@arista.com tty/ldsem: Wake up readers after timed out down_write()
-------------
Diffstat:
Makefile | 4 +- arch/arm64/include/asm/kvm_arm.h | 3 + arch/arm64/kernel/head.S | 5 +- arch/arm64/kernel/kaslr.c | 8 ++- arch/arm64/kvm/hyp/switch.c | 2 +- arch/mips/Kconfig | 1 + arch/mips/pci/msi-octeon.c | 4 +- crypto/authenc.c | 14 ++++- crypto/authencesn.c | 2 +- drivers/block/loop.c | 79 ++++++++++++------------ drivers/block/loop.h | 1 - drivers/crypto/caam/caamhash.c | 15 +++-- drivers/crypto/talitos.c | 27 +++----- drivers/gpu/drm/drm_fb_helper.c | 7 ++- drivers/media/platform/vivid/vivid-kthread-cap.c | 5 +- drivers/media/platform/vivid/vivid-kthread-out.c | 5 +- drivers/media/platform/vivid/vivid-vid-common.c | 2 +- drivers/media/usb/em28xx/em28xx-video.c | 4 +- drivers/media/v4l2-core/videobuf2-core.c | 11 +++- drivers/mfd/tps6586x.c | 24 +++++++ drivers/net/bonding/bond_main.c | 3 + drivers/scsi/scsi_pm.c | 26 ++++---- drivers/scsi/sd.c | 6 ++ drivers/tty/tty_io.c | 22 ++++--- drivers/tty/tty_ldsem.c | 10 +++ drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c | 2 + fs/block_dev.c | 28 ++++++--- fs/btrfs/disk-io.c | 8 +++ fs/f2fs/recovery.c | 31 +--------- fs/proc/array.c | 3 +- mm/memory.c | 22 +++++++ net/bridge/br_netfilter_hooks.c | 2 +- net/bridge/netfilter/ebtables.c | 6 +- net/can/gw.c | 30 ++++++++- net/ipv4/ip_sockglue.c | 12 ++-- net/ipv6/datagram.c | 11 ++-- net/packet/af_packet.c | 4 +- net/sctp/ipv6.c | 5 +- net/sctp/protocol.c | 4 +- net/sunrpc/rpcb_clnt.c | 8 +++ net/tipc/netlink_compat.c | 50 ++++++++++++++- security/security.c | 7 +++ security/selinux/ss/policydb.c | 3 +- security/yama/yama_lsm.c | 4 +- 44 files changed, 352 insertions(+), 178 deletions(-)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit 231f8fd0cca078bd4396dd7e380db813ac5736e2 upstream.
ldsem_down_read() will sleep if there is pending writer in the queue. If the writer times out, readers in the queue should be woken up, otherwise they may miss a chance to acquire the semaphore until the last active reader will do ldsem_up_read().
There was a couple of reports where there was one active reader and other readers soft locked up: Showing all locks held in the system: 2 locks held by khungtaskd/17: #0: (rcu_read_lock){......}, at: watchdog+0x124/0x6d1 #1: (tasklist_lock){.+.+..}, at: debug_show_all_locks+0x72/0x2d3 2 locks held by askfirst/123: #0: (&tty->ldisc_sem){.+.+.+}, at: ldsem_down_read+0x46/0x58 #1: (&ldata->atomic_read_lock){+.+...}, at: n_tty_read+0x115/0xbe4
Prevent readers wait for active readers to release ldisc semaphore.
Link: lkml.kernel.org/r/20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com Link: lkml.kernel.org/r/20180907045041.GF1110@shao2-debian Cc: Jiri Slaby jslaby@suse.com Cc: Peter Zijlstra peterz@infradead.org Cc: stable@vger.kernel.org Reported-by: kernel test robot rong.a.chen@intel.com Signed-off-by: Dmitry Safonov dima@arista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_ldsem.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/tty/tty_ldsem.c +++ b/drivers/tty/tty_ldsem.c @@ -307,6 +307,16 @@ down_write_failed(struct ld_semaphore *s if (!locked) ldsem_atomic_update(-LDSEM_WAIT_BIAS, sem); list_del(&waiter.list); + + /* + * In case of timeout, wake up every reader who gave the right of way + * to writer. Prevent separation readers into two groups: + * one that helds semaphore and another that sleeps. + * (in case of no contention with a writer) + */ + if (!locked && list_empty(&sem->write_wait)) + __ldsem_wake_readers(sem); + raw_spin_unlock_irq(&sem->wait_lock);
__set_task_state(tsk, TASK_RUNNING);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit 83d817f41070c48bc3eb7ec18e43000a548fca5c upstream.
tty_ldisc_reinit() doesn't race with neither tty_ldisc_hangup() nor set_ldisc() nor tty_ldisc_release() as they use tty lock. But it races with anyone who expects line discipline to be the same after hoding read semaphore in tty_ldisc_ref().
We've seen the following crash on v4.9.108 stable:
BUG: unable to handle kernel paging request at 0000000000002260 IP: [..] n_tty_receive_buf_common+0x5f/0x86d Workqueue: events_unbound flush_to_ldisc Call Trace: [..] n_tty_receive_buf2 [..] tty_ldisc_receive_buf [..] flush_to_ldisc [..] process_one_work [..] worker_thread [..] kthread [..] ret_from_fork
tty_ldisc_reinit() should be called with ldisc_sem hold for writing, which will protect any reader against line discipline changes.
Cc: Jiri Slaby jslaby@suse.com Cc: stable@vger.kernel.org # b027e2298bd5 ("tty: fix data race between tty_init_dev and flush of buf") Reviewed-by: Jiri Slaby jslaby@suse.cz Reported-by: syzbot+3aa9784721dfb90e984d@syzkaller.appspotmail.com Tested-by: Mark Rutland mark.rutland@arm.com Tested-by: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Signed-off-by: Dmitry Safonov dima@arista.com Tested-by: Tycho Andersen tycho@tycho.ws Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_io.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1487,15 +1487,20 @@ static int tty_reopen(struct tty_struct if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN)) return -EBUSY;
- tty->count++; + retval = tty_ldisc_lock(tty, 5 * HZ); + if (retval) + return retval;
+ tty->count++; if (tty->ldisc) - return 0; + goto out_unlock;
retval = tty_ldisc_reinit(tty, tty->termios.c_line); if (retval) tty->count--;
+out_unlock: + tty_ldisc_unlock(tty); return retval; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit cf62a1a13749db0d32b5cdd800ea91a4087319de upstream.
As notted by Jiri, tty_ldisc_reinit() shouldn't rely on tty counter. Simplify math by increasing the counter after reinit success.
Cc: Jiri Slaby jslaby@suse.com Link: lkml.kernel.org/r/20180829022353.23568-2-dima@arista.com Suggested-by: Jiri Slaby jslaby@suse.com Reviewed-by: Jiri Slaby jslaby@suse.cz Tested-by: Mark Rutland mark.rutland@arm.com Signed-off-by: Dmitry Safonov dima@arista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_io.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
--- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1491,16 +1491,13 @@ static int tty_reopen(struct tty_struct if (retval) return retval;
- tty->count++; - if (tty->ldisc) - goto out_unlock; + if (!tty->ldisc) + retval = tty_ldisc_reinit(tty, tty->termios.c_line); + tty_ldisc_unlock(tty);
- retval = tty_ldisc_reinit(tty, tty->termios.c_line); - if (retval) - tty->count--; + if (retval == 0) + tty->count++;
-out_unlock: - tty_ldisc_unlock(tty); return retval; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
commit d3736d82e8169768218ee0ef68718875918091a0 upstream.
Try to get reference for ldisc during tty_reopen(). If ldisc present, we don't need to do tty_ldisc_reinit() and lock the write side for line discipline semaphore. Effectively, it optimizes fast-path for tty_reopen(), but more importantly it won't interrupt ongoing IO on the tty as no ldisc change is needed. Fixes user-visible issue when tty_reopen() interrupted login process for user with a long password, observed and reported by Lukas.
Fixes: c96cf923a98d ("tty: Don't block on IO when ldisc change is pending") Fixes: 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()") Cc: Jiri Slaby jslaby@suse.com Reported-by: Lukas F. Hartmann lukas@mntmn.com Tested-by: Lukas F. Hartmann lukas@mntmn.com Cc: stable stable@vger.kernel.org Signed-off-by: Dmitry Safonov dima@arista.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/tty/tty_io.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
--- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1475,7 +1475,8 @@ static void tty_driver_remove_tty(struct static int tty_reopen(struct tty_struct *tty) { struct tty_driver *driver = tty->driver; - int retval; + struct tty_ldisc *ld; + int retval = 0;
if (driver->type == TTY_DRIVER_TYPE_PTY && driver->subtype == PTY_TYPE_MASTER) @@ -1487,13 +1488,18 @@ static int tty_reopen(struct tty_struct if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN)) return -EBUSY;
- retval = tty_ldisc_lock(tty, 5 * HZ); - if (retval) - return retval; - - if (!tty->ldisc) - retval = tty_ldisc_reinit(tty, tty->termios.c_line); - tty_ldisc_unlock(tty); + ld = tty_ldisc_ref_wait(tty); + if (ld) { + tty_ldisc_deref(ld); + } else { + retval = tty_ldisc_lock(tty, 5 * HZ); + if (retval) + return retval; + + if (!tty->ldisc) + retval = tty_ldisc_reinit(tty, tty->termios.c_line); + tty_ldisc_unlock(tty); + }
if (retval == 0) tty->count++;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver Hartkopp socketcan@hartkopp.net
commit 0aaa81377c5a01f686bcdb8c7a6929a7bf330c68 upstream.
Muyu Yu provided a POC where user root with CAP_NET_ADMIN can create a CAN frame modification rule that makes the data length code a higher value than the available CAN frame data size. In combination with a configured checksum calculation where the result is stored relatively to the end of the data (e.g. cgw_csum_xor_rel) the tail of the skb (e.g. frag_list pointer in skb_shared_info) can be rewritten which finally can cause a system crash.
Michael Kubecek suggested to drop frames that have a DLC exceeding the available space after the modification process and provided a patch that can handle CAN FD frames too. Within this patch we also limit the length for the checksum calculations to the maximum of Classic CAN data length (8).
CAN frames that are dropped by these additional checks are counted with the CGW_DELETED counter which indicates misconfigurations in can-gw rules.
This fixes CVE-2019-3701.
Reported-by: Muyu Yu ieatmuttonchuan@gmail.com Reported-by: Marcus Meissner meissner@suse.de Suggested-by: Michal Kubecek mkubecek@suse.cz Tested-by: Muyu Yu ieatmuttonchuan@gmail.com Tested-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Cc: linux-stable stable@vger.kernel.org # >= v3.2 Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/can/gw.c | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-)
--- a/net/can/gw.c +++ b/net/can/gw.c @@ -418,13 +418,29 @@ static void can_can_gw_rcv(struct sk_buf while (modidx < MAX_MODFUNCTIONS && gwj->mod.modfunc[modidx]) (*gwj->mod.modfunc[modidx++])(cf, &gwj->mod);
- /* check for checksum updates when the CAN frame has been modified */ + /* Has the CAN frame been modified? */ if (modidx) { - if (gwj->mod.csumfunc.crc8) + /* get available space for the processed CAN frame type */ + int max_len = nskb->len - offsetof(struct can_frame, data); + + /* dlc may have changed, make sure it fits to the CAN frame */ + if (cf->can_dlc > max_len) + goto out_delete; + + /* check for checksum updates in classic CAN length only */ + if (gwj->mod.csumfunc.crc8) { + if (cf->can_dlc > 8) + goto out_delete; + (*gwj->mod.csumfunc.crc8)(cf, &gwj->mod.csum.crc8); + } + + if (gwj->mod.csumfunc.xor) { + if (cf->can_dlc > 8) + goto out_delete;
- if (gwj->mod.csumfunc.xor) (*gwj->mod.csumfunc.xor)(cf, &gwj->mod.csum.xor); + } }
/* clear the skb timestamp if not configured the other way */ @@ -436,6 +452,14 @@ static void can_can_gw_rcv(struct sk_buf gwj->dropped_frames++; else gwj->handled_frames++; + + return; + + out_delete: + /* delete frame due to misconfiguration */ + gwj->deleted_frames++; + kfree_skb(nskb); + return; }
static inline int cgw_register_filter(struct cgw_job *gwj)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chao Yu yuchao0@huawei.com
commit d47b8715953ad0cf82bb0a9d30d7b11d83bc9c11 upstream.
i_times of inode will be set with current system time which can be configured through 'date', so it's not safe to judge dnode block as garbage data or unchanged inode depend on i_times.
Now, we have used enhanced 'cp_ver + cp' crc method to verify valid dnode block, so I expect recoverying invalid dnode is almost not possible.
This reverts commit 807b1e1c8e08452948495b1a9985ab46d329e5c2.
Signed-off-by: Chao Yu yuchao0@huawei.com Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Cc: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/f2fs/recovery.c | 31 +------------------------------ 1 file changed, 1 insertion(+), 30 deletions(-)
--- a/fs/f2fs/recovery.c +++ b/fs/f2fs/recovery.c @@ -196,32 +196,6 @@ static void recover_inode(struct inode * ino_of_node(page), name); }
-static bool is_same_inode(struct inode *inode, struct page *ipage) -{ - struct f2fs_inode *ri = F2FS_INODE(ipage); - struct timespec disk; - - if (!IS_INODE(ipage)) - return true; - - disk.tv_sec = le64_to_cpu(ri->i_ctime); - disk.tv_nsec = le32_to_cpu(ri->i_ctime_nsec); - if (timespec_compare(&inode->i_ctime, &disk) > 0) - return false; - - disk.tv_sec = le64_to_cpu(ri->i_atime); - disk.tv_nsec = le32_to_cpu(ri->i_atime_nsec); - if (timespec_compare(&inode->i_atime, &disk) > 0) - return false; - - disk.tv_sec = le64_to_cpu(ri->i_mtime); - disk.tv_nsec = le32_to_cpu(ri->i_mtime_nsec); - if (timespec_compare(&inode->i_mtime, &disk) > 0) - return false; - - return true; -} - static int find_fsync_dnodes(struct f2fs_sb_info *sbi, struct list_head *head) { struct curseg_info *curseg; @@ -248,10 +222,7 @@ static int find_fsync_dnodes(struct f2fs goto next;
entry = get_fsync_inode(head, ino_of_node(page)); - if (entry) { - if (!is_same_inode(entry->inode, page)) - goto next; - } else { + if (!entry) { if (IS_INODE(page) && is_dent_dnode(page)) { err = recover_inode_page(sbi, page); if (err)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ben Hutchings ben@decadent.org.uk
The backport of commit afeaade90db4 "media: em28xx: make v4l2-compliance happier by starting sequence on zero" added a reset on em28xx_v4l2::field_count to em28xx_ctrl_notify(), but it should be done in em28xx_start_analog_streaming().
Signed-off-by: Ben Hutchings ben@decadent.org.uk Cc: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/usb/em28xx/em28xx-video.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/media/usb/em28xx/em28xx-video.c +++ b/drivers/media/usb/em28xx/em28xx-video.c @@ -1062,6 +1062,8 @@ int em28xx_start_analog_streaming(struct
em28xx_videodbg("%s\n", __func__);
+ dev->v4l2->field_count = 0; + /* Make sure streaming is not already in progress for this type of filehandle (e.g. video, vbi) */ rc = res_get(dev, vq->type); @@ -1290,8 +1292,6 @@ static void em28xx_ctrl_notify(struct v4 { struct em28xx *dev = priv;
- dev->v4l2->field_count = 0; - /* * In the case of non-AC97 volume controls, we still need * to do some setups at em28xx, in order to mute/unmute
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
If CONFIG_SECCOMP=n, /proc/self/status includes an empty line. This causes the iotop application to bail out with an error message.
File "/usr/local/lib64/python2.7/site-packages/iotop/data.py", line 196, in parse_proc_pid_status key, value = line.split(':\t', 1) ValueError: need more than 1 value to unpack
The problem is seen in v4.9.y but not upstream because commit af884cd4a5ae6 ("proc: report no_new_privs state") has not been backported to v4.9.y. The backport of commit fae1fa0fc6cc ("proc: Provide details on speculation flaw mitigations") tried to address the resulting differences but was wrong, introducing the problem.
Fixes: 51ef9af2a35b ("proc: Provide details on speculation flaw mitigations") Cc: Kees Cook keescook@chromium.org Cc: Gwendal Grignou gwendal@chromium.org Signed-off-by: Guenter Roeck linux@roeck-us.net Acked-by: Kees Cook keescook@chromium.org --- This patch only applies to v4.9.y. v4.4.y also needs to be fixed (see https://www.spinics.net/lists/stable/msg279131.html), but the fix is slightly different. v4.14.y and later are not affected.
fs/proc/array.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -346,8 +346,9 @@ static inline void task_seccomp(struct s { #ifdef CONFIG_SECCOMP seq_put_decimal_ull(m, "Seccomp:\t", p->seccomp.mode); + seq_putc(m, '\n'); #endif - seq_printf(m, "\nSpeculation_Store_Bypass:\t"); + seq_printf(m, "Speculation_Store_Bypass:\t"); switch (arch_prctl_spec_ctrl_get(p, PR_SPEC_STORE_BYPASS)) { case -EINVAL: seq_printf(m, "unknown");
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
This reverts commit 8323aafe67b31c7f73d18747604ba1cc6c3e4f3a.
A wrong commit message was used for the stable commit because of a human error (and duplicate commit subject lines).
This patch reverts this error, and the following patches add the two upstream commits.
Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c index 8652475e01d0..2fb1bf1a26c5 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c @@ -631,11 +631,8 @@ static void cxgbit_send_halfclose(struct cxgbit_sock *csk)
static void cxgbit_arp_failure_discard(void *handle, struct sk_buff *skb) { - struct cxgbit_sock *csk = handle; - pr_debug("%s cxgbit_device %p\n", __func__, handle); kfree_skb(skb); - cxgbit_put_csk(csk); }
static void cxgbit_abort_arp_failure(void *handle, struct sk_buff *skb) @@ -1139,7 +1136,7 @@ cxgbit_pass_accept_rpl(struct cxgbit_sock *csk, struct cpl_pass_accept_req *req) rpl5->opt0 = cpu_to_be64(opt0); rpl5->opt2 = cpu_to_be32(opt2); set_wr_txq(skb, CPL_PRIORITY_SETUP, csk->ctrlq_idx); - t4_set_arp_err_handler(skb, csk, cxgbit_arp_failure_discard); + t4_set_arp_err_handler(skb, NULL, cxgbit_arp_failure_discard); cxgbit_l2t_send(csk->com.cdev, skb, csk->l2t); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit ed076c55b359cc9982ca8b065bcc01675f7365f6 ]
In case of arp failure call cxgbit_put_csk() to free csk.
Signed-off-by: Varun Prakash varun@chelsio.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c index 2fb1bf1a26c5..8652475e01d0 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c @@ -631,8 +631,11 @@ static void cxgbit_send_halfclose(struct cxgbit_sock *csk)
static void cxgbit_arp_failure_discard(void *handle, struct sk_buff *skb) { + struct cxgbit_sock *csk = handle; + pr_debug("%s cxgbit_device %p\n", __func__, handle); kfree_skb(skb); + cxgbit_put_csk(csk); }
static void cxgbit_abort_arp_failure(void *handle, struct sk_buff *skb) @@ -1136,7 +1139,7 @@ cxgbit_pass_accept_rpl(struct cxgbit_sock *csk, struct cpl_pass_accept_req *req) rpl5->opt0 = cpu_to_be64(opt0); rpl5->opt2 = cpu_to_be32(opt2); set_wr_txq(skb, CPL_PRIORITY_SETUP, csk->ctrlq_idx); - t4_set_arp_err_handler(skb, NULL, cxgbit_arp_failure_discard); + t4_set_arp_err_handler(skb, csk, cxgbit_arp_failure_discard); cxgbit_l2t_send(csk->com.cdev, skb, csk->l2t); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Backport of upstream commit 4eaed6aa2c628101246bcabc91b203bfac1193f8 ]
In KVM we define the configuration of HCR_EL2 for a VHE HOST in HCR_HOST_VHE_FLAGS, but we don't have a similar definition for the non-VHE host flags, and open-code HCR_RW. Further, in head.S we open-code the flags for VHE and non-VHE configurations.
In future, we're going to want to configure more flags for the host, so lets add a HCR_HOST_NVHE_FLAGS defintion, and consistently use both HCR_HOST_VHE_FLAGS and HCR_HOST_NVHE_FLAGS in the kvm code and head.S.
We now use mov_q to generate the HCR_EL2 value, as we use when configuring other registers in head.S.
Reviewed-by: Marc Zyngier marc.zyngier@arm.com Reviewed-by: Richard Henderson richard.henderson@linaro.org Signed-off-by: Mark Rutland mark.rutland@arm.com Reviewed-by: Christoffer Dall christoffer.dall@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier marc.zyngier@arm.com Cc: Will Deacon will.deacon@arm.com Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Will Deacon will.deacon@arm.com [kristina: backport to 4.9.y: adjust context] Signed-off-by: Kristina Martsenko kristina.martsenko@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/kernel/head.S | 5 ++--- arch/arm64/kvm/hyp/switch.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 68dedca5a47e..352bf2f7f60a 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -82,6 +82,7 @@ HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW) #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) #define HCR_INT_OVERRIDE (HCR_FMO | HCR_IMO) +#define HCR_HOST_NVHE_FLAGS (HCR_RW) #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
/* TCR_EL2 Registers bits */ diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index fa52817d84c5..3289d1458791 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -517,10 +517,9 @@ CPU_LE( bic x0, x0, #(3 << 24) ) // Clear the EE and E0E bits for EL1 #endif
/* Hyp configuration. */ - mov x0, #HCR_RW // 64-bit EL1 + mov_q x0, HCR_HOST_NVHE_FLAGS cbz x2, set_hcr - orr x0, x0, #HCR_TGE // Enable Host Extensions - orr x0, x0, #HCR_E2H + mov_q x0, HCR_HOST_VHE_FLAGS set_hcr: msr hcr_el2, x0 isb diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 12f9d1ecdf4c..115b0955715f 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -112,7 +112,7 @@ static void __hyp_text __deactivate_traps_vhe(void)
static void __hyp_text __deactivate_traps_nvhe(void) { - write_sysreg(HCR_RW, hcr_el2); + write_sysreg(HCR_HOST_NVHE_FLAGS, hcr_el2); write_sysreg(CPTR_EL2_DEFAULT, cptr_el2); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
[ Backport of upstream commit b3669b1e1c09890d61109a1a8ece2c5b66804714 ]
To allow EL0 (and/or EL1) to use pointer authentication functionality, we must ensure that pointer authentication instructions and accesses to pointer authentication keys are not trapped to EL2.
This patch ensures that HCR_EL2 is configured appropriately when the kernel is booted at EL2. For non-VHE kernels we set HCR_EL2.{API,APK}, ensuring that EL1 can access keys and permit EL0 use of instructions. For VHE kernels host EL0 (TGE && E2H) is unaffected by these settings, and it doesn't matter how we configure HCR_EL2.{API,APK}, so we don't bother setting them.
This does not enable support for KVM guests, since KVM manages HCR_EL2 itself when running VMs.
Reviewed-by: Richard Henderson richard.henderson@linaro.org Signed-off-by: Mark Rutland mark.rutland@arm.com Acked-by: Christoffer Dall christoffer.dall@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Marc Zyngier marc.zyngier@arm.com Cc: Will Deacon will.deacon@arm.com Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Will Deacon will.deacon@arm.com [kristina: backport to 4.9.y: adjust context] Signed-off-by: Kristina Martsenko kristina.martsenko@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/kvm_arm.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 352bf2f7f60a..a11c8c2915c9 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -23,6 +23,8 @@ #include <asm/types.h>
/* Hyp Configuration Register (HCR) bits */ +#define HCR_API (UL(1) << 41) +#define HCR_APK (UL(1) << 40) #define HCR_E2H (UL(1) << 34) #define HCR_ID (UL(1) << 33) #define HCR_CD (UL(1) << 32) @@ -82,7 +84,7 @@ HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW) #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) #define HCR_INT_OVERRIDE (HCR_FMO | HCR_IMO) -#define HCR_HOST_NVHE_FLAGS (HCR_RW) +#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK) #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
/* TCR_EL2 Registers bits */
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 7d033c9f6a7fd3821af75620a0257db87c2b552a ]
This patch makes sure the flow label in the IPv6 header forged in ipv6_local_error() is initialized.
BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 CPU: 1 PID: 24675 Comm: syz-executor1 Not tainted 4.20.0-rc7+ #4 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 kmsan_internal_check_memory+0x455/0xb00 mm/kmsan/kmsan.c:675 kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601 _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32 copy_to_user include/linux/uaccess.h:177 [inline] move_addr_to_user+0x2e9/0x4f0 net/socket.c:227 ___sys_recvmsg+0x5d7/0x1140 net/socket.c:2284 __sys_recvmsg net/socket.c:2327 [inline] __do_sys_recvmsg net/socket.c:2337 [inline] __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x457ec9 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f8750c06c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002f RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9 RDX: 0000000000002000 RSI: 0000000020000400 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8750c076d4 R13: 00000000004c4a60 R14: 00000000004d8140 R15: 00000000ffffffff
Uninit was stored to memory at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_save_stack mm/kmsan/kmsan.c:219 [inline] kmsan_internal_chain_origin+0x134/0x230 mm/kmsan/kmsan.c:439 __msan_chain_origin+0x70/0xe0 mm/kmsan/kmsan_instr.c:200 ipv6_recv_error+0x1e3f/0x1eb0 net/ipv6/datagram.c:475 udpv6_recvmsg+0x398/0x2ab0 net/ipv6/udp.c:335 inet_recvmsg+0x4fb/0x600 net/ipv4/af_inet.c:830 sock_recvmsg_nosec net/socket.c:794 [inline] sock_recvmsg+0x1d1/0x230 net/socket.c:801 ___sys_recvmsg+0x4d5/0x1140 net/socket.c:2278 __sys_recvmsg net/socket.c:2327 [inline] __do_sys_recvmsg net/socket.c:2337 [inline] __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2759 [inline] __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383 __kmalloc_reserve net/core/skbuff.c:137 [inline] __alloc_skb+0x309/0xa20 net/core/skbuff.c:205 alloc_skb include/linux/skbuff.h:998 [inline] ipv6_local_error+0x1a7/0x9e0 net/ipv6/datagram.c:334 __ip6_append_data+0x129f/0x4fd0 net/ipv6/ip6_output.c:1311 ip6_make_skb+0x6cc/0xcf0 net/ipv6/ip6_output.c:1775 udpv6_sendmsg+0x3f8e/0x45d0 net/ipv6/udp.c:1384 inet_sendmsg+0x54a/0x720 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] __sys_sendto+0x8c4/0xac0 net/socket.c:1788 __do_sys_sendto net/socket.c:1800 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:1796 __x64_sys_sendto+0x6e/0x90 net/socket.c:1796 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
Bytes 4-7 of 28 are uninitialized Memory access of size 28 starts at ffff8881937bfce0 Data copied to user address 0000000020000000
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/datagram.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -335,6 +335,7 @@ void ipv6_local_error(struct sock *sk, i skb_reset_network_header(skb); iph = ipv6_hdr(skb); iph->daddr = fl6->daddr; + ip6_flow_hdr(iph, 0, 0);
serr = SKB_EXT_ERR(skb); serr->ee.ee_errno = err;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: JianJhen Chen kchen@synology.com
[ Upstream commit 4c84edc11b76590859b1e45dd676074c59602dc4 ]
When handling DNAT'ed packets on a bridge device, the neighbour cache entry from lookup was used without checking its state. It means that a cache entry in the NUD_STALE state will be used directly instead of entering the NUD_DELAY state to confirm the reachability of the neighbor.
This problem becomes worse after commit 2724680bceee ("neigh: Keep neighbour cache entries if number of them is small enough."), since all neighbour cache entries in the NUD_STALE state will be kept in the neighbour table as long as the number of cache entries does not exceed the value specified in gc_thresh1.
This commit validates the state of a neighbour cache entry before using the entry.
Signed-off-by: JianJhen Chen kchen@synology.com Reviewed-by: JinLin Chen jlchen@synology.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bridge/br_netfilter_hooks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/bridge/br_netfilter_hooks.c +++ b/net/bridge/br_netfilter_hooks.c @@ -275,7 +275,7 @@ int br_nf_pre_routing_finish_bridge(stru struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb); int ret;
- if (neigh->hh.hh_len) { + if ((neigh->nud_state & NUD_CONNECTED) && neigh->hh.hh_len) { neigh_hh_bridge(&neigh->hh, skb); skb->dev = nf_bridge->physindev; ret = br_handle_frame_finish(net, sk, skb);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Gunthorpe jgg@mellanox.com
[ Upstream commit d972f3dce8d161e2142da0ab1ef25df00e2f21a9 ]
'dev' is non NULL when the addr_len check triggers so it must goto a label that does the dev_put otherwise dev will have a leaked refcount.
This bug causes the ib_ipoib module to become unloadable when using systemd-network as it triggers this check on InfiniBand links.
Fixes: 99137b7888f4 ("packet: validate address length") Reported-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Acked-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/packet/af_packet.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2663,7 +2663,7 @@ static int tpacket_snd(struct packet_soc addr = saddr->sll_halen ? saddr->sll_addr : NULL; dev = dev_get_by_index(sock_net(&po->sk), saddr->sll_ifindex); if (addr && dev && saddr->sll_halen < dev->addr_len) - goto out; + goto out_put; }
err = -ENXIO; @@ -2862,7 +2862,7 @@ static int packet_snd(struct socket *soc addr = saddr->sll_halen ? saddr->sll_addr : NULL; dev = dev_get_by_index(sock_net(sk), saddr->sll_ifindex); if (addr && dev && saddr->sll_halen < dev->addr_len) - goto out; + goto out_unlock; }
err = -ENXIO;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit 001e465f09a18857443489a57e74314a3368c805 ]
A network device stack with multiple layers of bonding devices can trigger a false positive lockdep warning. Adding lockdep nest levels fixes this. Update the level on both enslave and unlink, to avoid the following series of events ..
ip netns add test ip netns exec test bash ip link set dev lo addr 00:11:22:33:44:55 ip link set dev lo down
ip link add dev bond1 type bond ip link add dev bond2 type bond
ip link set dev lo master bond1 ip link set dev bond1 master bond2
ip link set dev bond1 nomaster ip link set dev bond2 master bond1
.. from still generating a splat:
[ 193.652127] ====================================================== [ 193.658231] WARNING: possible circular locking dependency detected [ 193.664350] 4.20.0 #8 Not tainted [ 193.668310] ------------------------------------------------------ [ 193.674417] ip/15577 is trying to acquire lock: [ 193.678897] 00000000a40e3b69 (&(&bond->stats_lock)->rlock#3/3){+.+.}, at: bond_get_stats+0x58/0x290 [ 193.687851] but task is already holding lock: [ 193.693625] 00000000807b9d9f (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0x58/0x290
[..]
[ 193.851092] lock_acquire+0xa7/0x190 [ 193.855138] _raw_spin_lock_nested+0x2d/0x40 [ 193.859878] bond_get_stats+0x58/0x290 [ 193.864093] dev_get_stats+0x5a/0xc0 [ 193.868140] bond_get_stats+0x105/0x290 [ 193.872444] dev_get_stats+0x5a/0xc0 [ 193.876493] rtnl_fill_stats+0x40/0x130 [ 193.880797] rtnl_fill_ifinfo+0x6c5/0xdc0 [ 193.885271] rtmsg_ifinfo_build_skb+0x86/0xe0 [ 193.890091] rtnetlink_event+0x5b/0xa0 [ 193.894320] raw_notifier_call_chain+0x43/0x60 [ 193.899225] netdev_change_features+0x50/0xa0 [ 193.904044] bond_compute_features.isra.46+0x1ab/0x270 [ 193.909640] bond_enslave+0x141d/0x15b0 [ 193.913946] do_set_master+0x89/0xa0 [ 193.918016] do_setlink+0x37c/0xda0 [ 193.921980] __rtnl_newlink+0x499/0x890 [ 193.926281] rtnl_newlink+0x48/0x70 [ 193.930238] rtnetlink_rcv_msg+0x171/0x4b0 [ 193.934801] netlink_rcv_skb+0xd1/0x110 [ 193.939103] rtnetlink_rcv+0x15/0x20 [ 193.943151] netlink_unicast+0x3b5/0x520 [ 193.947544] netlink_sendmsg+0x2fd/0x3f0 [ 193.951942] sock_sendmsg+0x38/0x50 [ 193.955899] ___sys_sendmsg+0x2ba/0x2d0 [ 193.960205] __x64_sys_sendmsg+0xad/0x100 [ 193.964687] do_syscall_64+0x5a/0x460 [ 193.968823] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: 7e2556e40026 ("bonding: avoid lockdep confusion in bond_get_stats()") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1900,6 +1900,9 @@ static int __bond_release_one(struct net if (!bond_has_slaves(bond)) { bond_set_carrier(bond); eth_hw_addr_random(bond_dev); + bond->nest_level = SINGLE_DEPTH_NESTING; + } else { + bond->nest_level = dev_get_nest_level(bond_dev) + 1; }
unblock_netpoll_tx();
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit 4a06fa67c4da20148803525151845276cdb995c1 ]
Commit 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull") avoided a read beyond the end of the skb linear segment by calling pskb_may_pull.
That function can trigger a BUG_ON in pskb_expand_head if the skb is shared, which it is when when peeking. It can also return ENOMEM.
Avoid both by switching to safer skb_header_pointer.
Fixes: 2efd4fca703a ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull") Reported-by: syzbot syzkaller@googlegroups.com Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/ip_sockglue.c | 12 +++++------- net/ipv6/datagram.c | 10 ++++------ 2 files changed, 9 insertions(+), 13 deletions(-)
--- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -133,19 +133,17 @@ static void ip_cmsg_recv_security(struct
static void ip_cmsg_recv_dstaddr(struct msghdr *msg, struct sk_buff *skb) { + __be16 _ports[2], *ports; struct sockaddr_in sin; - __be16 *ports; - int end; - - end = skb_transport_offset(skb) + 4; - if (end > 0 && !pskb_may_pull(skb, end)) - return;
/* All current transport protocols have the port numbers in the * first four bytes of the transport header and this function is * written with this assumption in mind. */ - ports = (__be16 *)skb_transport_header(skb); + ports = skb_header_pointer(skb, skb_transport_offset(skb), + sizeof(_ports), &_ports); + if (!ports) + return;
sin.sin_family = AF_INET; sin.sin_addr.s_addr = ip_hdr(skb)->daddr; --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -695,17 +695,15 @@ void ip6_datagram_recv_specific_ctl(stru } if (np->rxopt.bits.rxorigdstaddr) { struct sockaddr_in6 sin6; - __be16 *ports; - int end; + __be16 _ports[2], *ports;
- end = skb_transport_offset(skb) + 4; - if (end <= 0 || pskb_may_pull(skb, end)) { + ports = skb_header_pointer(skb, skb_transport_offset(skb), + sizeof(_ports), &_ports); + if (ports) { /* All current transport protocols have the port numbers in the * first four bytes of the transport header and this function is * written with this assumption in mind. */ - ports = (__be16 *)skb_transport_header(skb); - sin6.sin6_family = AF_INET6; sin6.sin6_addr = ipv6_hdr(skb)->daddr; sin6.sin6_port = ports[1];
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aymen Sghaier aymen.sghaier@nxp.com
commit 04e6d25c5bb244c1a37eb9fe0b604cc11a04e8c5 upstream.
Recent changes - probably DMA API related (generic and/or arm64-specific) - exposed a case where driver maps a zero-length buffer: ahash_init()->ahash_update()->ahash_final() with a zero-length string to hash
kernel BUG at kernel/dma/swiotlb.c:475! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 2 PID: 1823 Comm: cryptomgr_test Not tainted 4.20.0-rc1-00108-g00c9fe37a7f2 #1 Hardware name: LS1046A RDB Board (DT) pstate: 80000005 (Nzcv daif -PAN -UAO) pc : swiotlb_tbl_map_single+0x170/0x2b8 lr : swiotlb_map_page+0x134/0x1f8 sp : ffff00000f79b8f0 x29: ffff00000f79b8f0 x28: 0000000000000000 x27: ffff0000093d0000 x26: 0000000000000000 x25: 00000000001f3ffe x24: 0000000000200000 x23: 0000000000000000 x22: 00000009f2c538c0 x21: ffff800970aeb410 x20: 0000000000000001 x19: ffff800970aeb410 x18: 0000000000000007 x17: 000000000000000e x16: 0000000000000001 x15: 0000000000000019 x14: c32cb8218a167fe8 x13: ffffffff00000000 x12: ffff80097fdae348 x11: 0000800976bca000 x10: 0000000000000010 x9 : 0000000000000000 x8 : ffff0000091fd6c8 x7 : 0000000000000000 x6 : 00000009f2c538bf x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 00000009f2c538c0 x1 : 00000000f9fff000 x0 : 0000000000000000 Process cryptomgr_test (pid: 1823, stack limit = 0x(____ptrval____)) Call trace: swiotlb_tbl_map_single+0x170/0x2b8 swiotlb_map_page+0x134/0x1f8 ahash_final_no_ctx+0xc4/0x6cc ahash_final+0x10/0x18 crypto_ahash_op+0x30/0x84 crypto_ahash_final+0x14/0x1c __test_hash+0x574/0xe0c test_hash+0x28/0x80 __alg_test_hash+0x84/0xd0 alg_test_hash+0x78/0x144 alg_test.part.30+0x12c/0x2b4 alg_test+0x3c/0x68 cryptomgr_test+0x44/0x4c kthread+0xfc/0x128 ret_from_fork+0x10/0x18 Code: d34bfc18 2a1a03f7 1a9f8694 35fff89a (d4210000)
Cc: stable@vger.kernel.org Signed-off-by: Aymen Sghaier aymen.sghaier@nxp.com Signed-off-by: Horia Geantă horia.geanta@nxp.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/caam/caamhash.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/crypto/caam/caamhash.c +++ b/drivers/crypto/caam/caamhash.c @@ -1232,13 +1232,16 @@ static int ahash_final_no_ctx(struct aha
desc = edesc->hw_desc;
- state->buf_dma = dma_map_single(jrdev, buf, buflen, DMA_TO_DEVICE); - if (dma_mapping_error(jrdev, state->buf_dma)) { - dev_err(jrdev, "unable to map src\n"); - goto unmap; - } + if (buflen) { + state->buf_dma = dma_map_single(jrdev, buf, buflen, + DMA_TO_DEVICE); + if (dma_mapping_error(jrdev, state->buf_dma)) { + dev_err(jrdev, "unable to map src\n"); + goto unmap; + }
- append_seq_in_ptr(desc, state->buf_dma, buflen, 0); + append_seq_in_ptr(desc, state->buf_dma, buflen, 0); + }
edesc->dst_dma = map_seq_out_ptr_result(desc, jrdev, req->result, digestsize);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harsh Jain harsh@chelsio.com
commit a7773363624b034ab198c738661253d20a8055c2 upstream.
Authencesn template in decrypt path unconditionally calls aead_request_complete after ahash_verify which leads to following kernel panic in after decryption.
[ 338.539800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [ 338.548372] PGD 0 P4D 0 [ 338.551157] Oops: 0000 [#1] SMP PTI [ 338.554919] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G W I 4.19.7+ #13 [ 338.564431] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0 07/29/10 [ 338.572212] RIP: 0010:esp_input_done2+0x350/0x410 [esp4] [ 338.578030] Code: ff 0f b6 68 10 48 8b 83 c8 00 00 00 e9 8e fe ff ff 8b 04 25 04 00 00 00 83 e8 01 48 98 48 8b 3c c5 10 00 00 00 e9 f7 fd ff ff <8b> 04 25 04 00 00 00 83 e8 01 48 98 4c 8b 24 c5 10 00 00 00 e9 3b [ 338.598547] RSP: 0018:ffff911c97803c00 EFLAGS: 00010246 [ 338.604268] RAX: 0000000000000002 RBX: ffff911c4469ee00 RCX: 0000000000000000 [ 338.612090] RDX: 0000000000000000 RSI: 0000000000000130 RDI: ffff911b87c20400 [ 338.619874] RBP: 0000000000000000 R08: ffff911b87c20498 R09: 000000000000000a [ 338.627610] R10: 0000000000000001 R11: 0000000000000004 R12: 0000000000000000 [ 338.635402] R13: ffff911c89590000 R14: ffff911c91730000 R15: 0000000000000000 [ 338.643234] FS: 0000000000000000(0000) GS:ffff911c97800000(0000) knlGS:0000000000000000 [ 338.652047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 338.658299] CR2: 0000000000000004 CR3: 00000001ec20a000 CR4: 00000000000006f0 [ 338.666382] Call Trace: [ 338.669051] <IRQ> [ 338.671254] esp_input_done+0x12/0x20 [esp4] [ 338.675922] chcr_handle_resp+0x3b5/0x790 [chcr] [ 338.680949] cpl_fw6_pld_handler+0x37/0x60 [chcr] [ 338.686080] chcr_uld_rx_handler+0x22/0x50 [chcr] [ 338.691233] uldrx_handler+0x8c/0xc0 [cxgb4] [ 338.695923] process_responses+0x2f0/0x5d0 [cxgb4] [ 338.701177] ? bitmap_find_next_zero_area_off+0x3a/0x90 [ 338.706882] ? matrix_alloc_area.constprop.7+0x60/0x90 [ 338.712517] ? apic_update_irq_cfg+0x82/0xf0 [ 338.717177] napi_rx_handler+0x14/0xe0 [cxgb4] [ 338.722015] net_rx_action+0x2aa/0x3e0 [ 338.726136] __do_softirq+0xcb/0x280 [ 338.730054] irq_exit+0xde/0xf0 [ 338.733504] do_IRQ+0x54/0xd0 [ 338.736745] common_interrupt+0xf/0xf
Fixes: 104880a6b470 ("crypto: authencesn - Convert to new AEAD...") Signed-off-by: Harsh Jain harsh@chelsio.com Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/authencesn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/crypto/authencesn.c +++ b/crypto/authencesn.c @@ -279,7 +279,7 @@ static void authenc_esn_verify_ahash_don struct aead_request *req = areq->data;
err = err ?: crypto_authenc_esn_decrypt_tail(req, 0); - aead_request_complete(req, err); + authenc_esn_request_complete(req, err); }
static int crypto_authenc_esn_decrypt(struct aead_request *req)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit 8f9c469348487844328e162db57112f7d347c49f upstream.
Keys for "authenc" AEADs are formatted as an rtattr containing a 4-byte 'enckeylen', followed by an authentication key and an encryption key. crypto_authenc_extractkeys() parses the key to find the inner keys.
However, it fails to consider the case where the rtattr's payload is longer than 4 bytes but not 4-byte aligned, and where the key ends before the next 4-byte aligned boundary. In this case, 'keylen -= RTA_ALIGN(rta->rta_len);' underflows to a value near UINT_MAX. This causes a buffer overread and crash during crypto_ahash_setkey().
Fix it by restricting the rtattr payload to the expected size.
Reproducer using AF_ALG:
#include <linux/if_alg.h> #include <linux/rtnetlink.h> #include <sys/socket.h>
int main() { int fd; struct sockaddr_alg addr = { .salg_type = "aead", .salg_name = "authenc(hmac(sha256),cbc(aes))", }; struct { struct rtattr attr; __be32 enckeylen; char keys[1]; } __attribute__((packed)) key = { .attr.rta_len = sizeof(key), .attr.rta_type = 1 /* CRYPTO_AUTHENC_KEYA_PARAM */, };
fd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(fd, (void *)&addr, sizeof(addr)); setsockopt(fd, SOL_ALG, ALG_SET_KEY, &key, sizeof(key)); }
It caused:
BUG: unable to handle kernel paging request at ffff88007ffdc000 PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 2e05067 PTE 0 Oops: 0000 [#1] SMP CPU: 0 PID: 883 Comm: authenc Not tainted 4.20.0-rc1-00108-g00c9fe37a7f27 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014 RIP: 0010:sha256_ni_transform+0xb3/0x330 arch/x86/crypto/sha256_ni_asm.S:155 [...] Call Trace: sha256_ni_finup+0x10/0x20 arch/x86/crypto/sha256_ssse3_glue.c:321 crypto_shash_finup+0x1a/0x30 crypto/shash.c:178 shash_digest_unaligned+0x45/0x60 crypto/shash.c:186 crypto_shash_digest+0x24/0x40 crypto/shash.c:202 hmac_setkey+0x135/0x1e0 crypto/hmac.c:66 crypto_shash_setkey+0x2b/0xb0 crypto/shash.c:66 shash_async_setkey+0x10/0x20 crypto/shash.c:223 crypto_ahash_setkey+0x2d/0xa0 crypto/ahash.c:202 crypto_authenc_setkey+0x68/0x100 crypto/authenc.c:96 crypto_aead_setkey+0x2a/0xc0 crypto/aead.c:62 aead_setkey+0xc/0x10 crypto/algif_aead.c:526 alg_setkey crypto/af_alg.c:223 [inline] alg_setsockopt+0xfe/0x130 crypto/af_alg.c:256 __sys_setsockopt+0x6d/0xd0 net/socket.c:1902 __do_sys_setsockopt net/socket.c:1913 [inline] __se_sys_setsockopt net/socket.c:1910 [inline] __x64_sys_setsockopt+0x1f/0x30 net/socket.c:1910 do_syscall_64+0x4a/0x180 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: e236d4a89a2f ("[CRYPTO] authenc: Move enckeylen into key itself") Cc: stable@vger.kernel.org # v2.6.25+ Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- crypto/authenc.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/crypto/authenc.c +++ b/crypto/authenc.c @@ -58,14 +58,22 @@ int crypto_authenc_extractkeys(struct cr return -EINVAL; if (rta->rta_type != CRYPTO_AUTHENC_KEYA_PARAM) return -EINVAL; - if (RTA_PAYLOAD(rta) < sizeof(*param)) + + /* + * RTA_OK() didn't align the rtattr's payload when validating that it + * fits in the buffer. Yet, the keys should start on the next 4-byte + * aligned boundary. To avoid confusion, require that the rtattr + * payload be exactly the param struct, which has a 4-byte aligned size. + */ + if (RTA_PAYLOAD(rta) != sizeof(*param)) return -EINVAL; + BUILD_BUG_ON(sizeof(*param) % RTA_ALIGNTO);
param = RTA_DATA(rta); keys->enckeylen = be32_to_cpu(param->enckeylen);
- key += RTA_ALIGN(rta->rta_len); - keylen -= RTA_ALIGN(rta->rta_len); + key += rta->rta_len; + keylen -= rta->rta_len;
if (keylen < keys->enckeylen) return -EINVAL;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit 74d5d229b1bf60f93bff244b2dfc0eb21ec32a07 upstream.
If we flip read-only before we initiate writeback on all dirty pages for ordered extents we've created then we'll have ordered extents left over on umount, which results in all sorts of bad things happening. Fix this by making sure we wait on ordered extents if we have to do the aborted transaction cleanup stuff.
generic/475 can produce this warning:
[ 8531.177332] WARNING: CPU: 2 PID: 11997 at fs/btrfs/disk-io.c:3856 btrfs_free_fs_root+0x95/0xa0 [btrfs] [ 8531.183282] CPU: 2 PID: 11997 Comm: umount Tainted: G W 5.0.0-rc1-default+ #394 [ 8531.185164] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014 [ 8531.187851] RIP: 0010:btrfs_free_fs_root+0x95/0xa0 [btrfs] [ 8531.193082] RSP: 0018:ffffb1ab86163d98 EFLAGS: 00010286 [ 8531.194198] RAX: ffff9f3449494d18 RBX: ffff9f34a2695000 RCX:0000000000000000 [ 8531.195629] RDX: 0000000000000002 RSI: 0000000000000001 RDI:0000000000000000 [ 8531.197315] RBP: ffff9f344e930000 R08: 0000000000000001 R09:0000000000000000 [ 8531.199095] R10: 0000000000000000 R11: ffff9f34494d4ff8 R12:ffffb1ab86163dc0 [ 8531.200870] R13: ffff9f344e9300b0 R14: ffffb1ab86163db8 R15:0000000000000000 [ 8531.202707] FS: 00007fc68e949fc0(0000) GS:ffff9f34bd800000(0000)knlGS:0000000000000000 [ 8531.204851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 8531.205942] CR2: 00007ffde8114dd8 CR3: 000000002dfbd000 CR4:00000000000006e0 [ 8531.207516] Call Trace: [ 8531.208175] btrfs_free_fs_roots+0xdb/0x170 [btrfs] [ 8531.210209] ? wait_for_completion+0x5b/0x190 [ 8531.211303] close_ctree+0x157/0x350 [btrfs] [ 8531.212412] generic_shutdown_super+0x64/0x100 [ 8531.213485] kill_anon_super+0x14/0x30 [ 8531.214430] btrfs_kill_super+0x12/0xa0 [btrfs] [ 8531.215539] deactivate_locked_super+0x29/0x60 [ 8531.216633] cleanup_mnt+0x3b/0x70 [ 8531.217497] task_work_run+0x98/0xc0 [ 8531.218397] exit_to_usermode_loop+0x83/0x90 [ 8531.219324] do_syscall_64+0x15b/0x180 [ 8531.220192] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 8531.221286] RIP: 0033:0x7fc68e5e4d07 [ 8531.225621] RSP: 002b:00007ffde8116608 EFLAGS: 00000246 ORIG_RAX:00000000000000a6 [ 8531.227512] RAX: 0000000000000000 RBX: 00005580c2175970 RCX:00007fc68e5e4d07 [ 8531.229098] RDX: 0000000000000001 RSI: 0000000000000000 RDI:00005580c2175b80 [ 8531.230730] RBP: 0000000000000000 R08: 00005580c2175ba0 R09:00007ffde8114e80 [ 8531.232269] R10: 0000000000000000 R11: 0000000000000246 R12:00005580c2175b80 [ 8531.233839] R13: 00007fc68eac61c4 R14: 00005580c2175a68 R15:0000000000000000
Leaving a tree in the rb-tree:
3853 void btrfs_free_fs_root(struct btrfs_root *root) 3854 { 3855 iput(root->ino_cache_inode); 3856 WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
CC: stable@vger.kernel.org Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Josef Bacik josef@toxicpanda.com [ add stacktrace ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/disk-io.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4193,6 +4193,14 @@ static void btrfs_destroy_all_ordered_ex spin_lock(&fs_info->ordered_root_lock); } spin_unlock(&fs_info->ordered_root_lock); + + /* + * We need this here because if we've been flipped read-only we won't + * get sync() from the umount, so we need to make sure any ordered + * extents that haven't had their dirty pages IO start writeout yet + * actually get run and error out properly. + */ + btrfs_wait_ordered_roots(fs_info, -1, 0, (u64)-1); }
static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook keescook@chromium.org
commit 9474f4e7cd71a633fa1ef93b7daefd44bbdfd482 upstream.
It's possible that a pid has died before we take the rcu lock, in which case we can't walk the ancestry list as it may be detached. Instead, check for death first before doing the walk.
Reported-by: syzbot+a9ac39bf55329e206219@syzkaller.appspotmail.com Fixes: 2d514487faf1 ("security: Yama LSM") Cc: stable@vger.kernel.org Suggested-by: Oleg Nesterov oleg@redhat.com Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: James Morris james.morris@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/yama/yama_lsm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/security/yama/yama_lsm.c +++ b/security/yama/yama_lsm.c @@ -359,7 +359,9 @@ static int yama_ptrace_access_check(stru break; case YAMA_SCOPE_RELATIONAL: rcu_read_lock(); - if (!task_is_descendant(current, child) && + if (!pid_alive(child)) + rc = -EPERM; + if (!rc && !task_is_descendant(current, child) && !ptracer_exception_found(current, child) && !ns_capable(__task_cred(child)->user_ns, CAP_SYS_PTRACE)) rc = -EPERM;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanley Chu stanley.chu@mediatek.com
commit 3f7e62bba0003f9c68f599f5997c4647ef5b4f4e upstream.
The commit 356fd2663cff ("scsi: Set request queue runtime PM status back to active on resume") fixed up the inconsistent RPM status between request queue and device. However changing request queue RPM status shall be done only on successful resume, otherwise status may be still inconsistent as below,
Request queue: RPM_ACTIVE Device: RPM_SUSPENDED
This ends up soft lockup because requests can be submitted to underlying devices but those devices and their required resource are not resumed.
For example,
After above inconsistent status happens, IO request can be submitted to UFS device driver but required resource (like clock) is not resumed yet thus lead to warning as below call stack,
WARN_ON(hba->clk_gating.state != CLKS_ON); ufshcd_queuecommand scsi_dispatch_cmd scsi_request_fn __blk_run_queue cfq_insert_request __elv_add_request blk_flush_plug_list blk_finish_plug jbd2_journal_commit_transaction kjournald2
We may see all behind IO requests hang because of no response from storage host or device and then soft lockup happens in system. In the end, system may crash in many ways.
Fixes: 356fd2663cff (scsi: Set request queue runtime PM status back to active on resume) Cc: stable@vger.kernel.org Signed-off-by: Stanley Chu stanley.chu@mediatek.com Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/scsi_pm.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
--- a/drivers/scsi/scsi_pm.c +++ b/drivers/scsi/scsi_pm.c @@ -79,8 +79,22 @@ static int scsi_dev_type_resume(struct d
if (err == 0) { pm_runtime_disable(dev); - pm_runtime_set_active(dev); + err = pm_runtime_set_active(dev); pm_runtime_enable(dev); + + /* + * Forcibly set runtime PM status of request queue to "active" + * to make sure we can again get requests from the queue + * (see also blk_pm_peek_request()). + * + * The resume hook will correct runtime PM status of the disk. + */ + if (!err && scsi_is_sdev_device(dev)) { + struct scsi_device *sdev = to_scsi_device(dev); + + if (sdev->request_queue->dev) + blk_set_runtime_active(sdev->request_queue); + } }
return err; @@ -139,16 +153,6 @@ static int scsi_bus_resume_common(struct else fn = NULL;
- /* - * Forcibly set runtime PM status of request queue to "active" to - * make sure we can again get requests from the queue (see also - * blk_pm_peek_request()). - * - * The resume hook will correct runtime PM status of the disk. - */ - if (scsi_is_sdev_device(dev) && pm_runtime_suspended(dev)) - blk_set_runtime_active(to_scsi_device(dev)->request_queue); - if (fn) { async_schedule_domain(fn, dev, &scsi_sd_pm_domain);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ivan Mironov mironov.ivan@gmail.com
commit 44759979a49bfd2d20d789add7fa81a21eb1a4ab upstream.
Changing of caching mode via /sys/devices/.../scsi_disk/.../cache_type may fail if device responds to MODE SENSE command with DPOFUA flag set, and then checks this flag to be not set on MODE SELECT command.
In this scenario, when trying to change cache_type, write always fails:
# echo "none" >cache_type bash: echo: write error: Invalid argument
And following appears in dmesg:
[13007.865745] sd 1:0:1:0: [sda] Sense Key : Illegal Request [current] [13007.865753] sd 1:0:1:0: [sda] Add. Sense: Invalid field in parameter list
From SBC-4 r15, 6.5.1 "Mode pages overview", description of DEVICE-SPECIFIC
PARAMETER field in the mode parameter header: ... The write protect (WP) bit for mode data sent with a MODE SELECT command shall be ignored by the device server. ... The DPOFUA bit is reserved for mode data sent with a MODE SELECT command. ...
The remaining bits in the DEVICE-SPECIFIC PARAMETER byte are also reserved and shall be set to zero.
[mkp: shuffled commentary to commit description]
Cc: stable@vger.kernel.org Signed-off-by: Ivan Mironov mironov.ivan@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/sd.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -208,6 +208,12 @@ cache_type_store(struct device *dev, str sp = buffer_data[0] & 0x80 ? 1 : 0; buffer_data[0] &= ~0x80;
+ /* + * Ensure WP, DPOFUA, and RESERVED fields are cleared in + * received mode parameter buffer before doing MODE SELECT. + */ + data.device_specific = 0; + if (scsi_mode_select(sdp, 1, sp, 8, buffer_data, len, SD_TIMEOUT, SD_MAX_RETRIES, &data, &sshdr)) { if (scsi_sense_valid(&sshdr))
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit c56c2e173773097a248fd3bace91ac8f6fc5386d upstream.
This patch moves the mapping of IV after the kmalloc(). This avoids having to unmap in case kmalloc() fails.
Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Reviewed-by: Horia Geantă horia.geanta@nxp.com Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/talitos.c | 26 +++++++------------------- 1 file changed, 7 insertions(+), 19 deletions(-)
--- a/drivers/crypto/talitos.c +++ b/drivers/crypto/talitos.c @@ -1347,23 +1347,18 @@ static struct talitos_edesc *talitos_ede struct talitos_private *priv = dev_get_drvdata(dev); bool is_sec1 = has_ftr_sec1(priv); int max_len = is_sec1 ? TALITOS1_MAX_DATA_LEN : TALITOS2_MAX_DATA_LEN; - void *err;
if (cryptlen + authsize > max_len) { dev_err(dev, "length exceeds h/w max limit\n"); return ERR_PTR(-EINVAL); }
- if (ivsize) - iv_dma = dma_map_single(dev, iv, ivsize, DMA_TO_DEVICE); - if (!dst || dst == src) { src_len = assoclen + cryptlen + authsize; src_nents = sg_nents_for_len(src, src_len); if (src_nents < 0) { dev_err(dev, "Invalid number of src SG.\n"); - err = ERR_PTR(-EINVAL); - goto error_sg; + return ERR_PTR(-EINVAL); } src_nents = (src_nents == 1) ? 0 : src_nents; dst_nents = dst ? src_nents : 0; @@ -1373,16 +1368,14 @@ static struct talitos_edesc *talitos_ede src_nents = sg_nents_for_len(src, src_len); if (src_nents < 0) { dev_err(dev, "Invalid number of src SG.\n"); - err = ERR_PTR(-EINVAL); - goto error_sg; + return ERR_PTR(-EINVAL); } src_nents = (src_nents == 1) ? 0 : src_nents; dst_len = assoclen + cryptlen + (encrypt ? authsize : 0); dst_nents = sg_nents_for_len(dst, dst_len); if (dst_nents < 0) { dev_err(dev, "Invalid number of dst SG.\n"); - err = ERR_PTR(-EINVAL); - goto error_sg; + return ERR_PTR(-EINVAL); } dst_nents = (dst_nents == 1) ? 0 : dst_nents; } @@ -1407,11 +1400,10 @@ static struct talitos_edesc *talitos_ede }
edesc = kmalloc(alloc_len, GFP_DMA | flags); - if (!edesc) { - dev_err(dev, "could not allocate edescriptor\n"); - err = ERR_PTR(-ENOMEM); - goto error_sg; - } + if (!edesc) + return ERR_PTR(-ENOMEM); + if (ivsize) + iv_dma = dma_map_single(dev, iv, ivsize, DMA_TO_DEVICE);
edesc->src_nents = src_nents; edesc->dst_nents = dst_nents; @@ -1423,10 +1415,6 @@ static struct talitos_edesc *talitos_ede DMA_BIDIRECTIONAL);
return edesc; -error_sg: - if (iv_dma) - dma_unmap_single(dev, iv_dma, ivsize, DMA_TO_DEVICE); - return err; }
static struct talitos_edesc *aead_edesc_alloc(struct aead_request *areq, u8 *iv,
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@c-s.fr
commit 1bea445b0a022ee126ca328b3705cd4df18ebc14 upstream.
[ 2.364486] WARNING: CPU: 0 PID: 60 at ./arch/powerpc/include/asm/io.h:837 dma_nommu_map_page+0x44/0xd4 [ 2.373579] CPU: 0 PID: 60 Comm: cryptomgr_test Tainted: G W 4.20.0-rc5-00560-g6bfb52e23a00-dirty #531 [ 2.384740] NIP: c000c540 LR: c000c584 CTR: 00000000 [ 2.389743] REGS: c95abab0 TRAP: 0700 Tainted: G W (4.20.0-rc5-00560-g6bfb52e23a00-dirty) [ 2.400042] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24042204 XER: 00000000 [ 2.406669] [ 2.406669] GPR00: c02f2244 c95abb60 c6262990 c95abd80 0000256a 00000001 00000001 00000001 [ 2.406669] GPR08: 00000000 00002000 00000010 00000010 24042202 00000000 00000100 c95abd88 [ 2.406669] GPR16: 00000000 c05569d4 00000001 00000010 c95abc88 c0615664 00000004 00000000 [ 2.406669] GPR24: 00000010 c95abc88 c95abc88 00000000 c61ae210 c7ff6d40 c61ae210 00003d68 [ 2.441559] NIP [c000c540] dma_nommu_map_page+0x44/0xd4 [ 2.446720] LR [c000c584] dma_nommu_map_page+0x88/0xd4 [ 2.451762] Call Trace: [ 2.454195] [c95abb60] [82000808] 0x82000808 (unreliable) [ 2.459572] [c95abb80] [c02f2244] talitos_edesc_alloc+0xbc/0x3c8 [ 2.465493] [c95abbb0] [c02f2600] ablkcipher_edesc_alloc+0x4c/0x5c [ 2.471606] [c95abbd0] [c02f4ed0] ablkcipher_encrypt+0x20/0x64 [ 2.477389] [c95abbe0] [c02023b0] __test_skcipher+0x4bc/0xa08 [ 2.483049] [c95abe00] [c0204b60] test_skcipher+0x2c/0xcc [ 2.488385] [c95abe20] [c0204c48] alg_test_skcipher+0x48/0xbc [ 2.494064] [c95abe40] [c0205cec] alg_test+0x164/0x2e8 [ 2.499142] [c95abf00] [c0200dec] cryptomgr_test+0x48/0x50 [ 2.504558] [c95abf10] [c0039ff4] kthread+0xe4/0x110 [ 2.509471] [c95abf40] [c000e1d0] ret_from_kernel_thread+0x14/0x1c [ 2.515532] Instruction dump: [ 2.518468] 7c7e1b78 7c9d2378 7cbf2b78 41820054 3d20c076 8089c200 3d20c076 7c84e850 [ 2.526127] 8129c204 7c842e70 7f844840 419c0008 <0fe00000> 2f9e0000 54847022 7c84fa14 [ 2.533960] ---[ end trace bf78d94af73fe3b8 ]--- [ 2.539123] talitos ff020000.crypto: master data transfer error [ 2.544775] talitos ff020000.crypto: TEA error: ISR 0x20000000_00000040 [ 2.551625] alg: skcipher: encryption failed on test 1 for ecb-aes-talitos: ret=22
IV cannot be on stack when CONFIG_VMAP_STACK is selected because the stack cannot be DMA mapped anymore.
This patch copies the IV into the extended descriptor.
Fixes: 4de9d0b547b9 ("crypto: talitos - Add ablkcipher algorithms") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy christophe.leroy@c-s.fr Reviewed-by: Horia Geantă horia.geanta@nxp.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/crypto/talitos.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/crypto/talitos.c +++ b/drivers/crypto/talitos.c @@ -1398,12 +1398,15 @@ static struct talitos_edesc *talitos_ede dma_len = 0; alloc_len += icv_stashing ? authsize : 0; } + alloc_len += ivsize;
edesc = kmalloc(alloc_len, GFP_DMA | flags); if (!edesc) return ERR_PTR(-ENOMEM); - if (ivsize) + if (ivsize) { + iv = memcpy(((u8 *)edesc) + alloc_len - ivsize, iv, ivsize); iv_dma = dma_map_single(dev, iv, ivsize, DMA_TO_DEVICE); + }
edesc->src_nents = src_nents; edesc->dst_nents = dst_nents;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit 5a9372f751b5350e0ce3d2ee91832f1feae2c2e5 upstream.
While reading through the sysvipc implementation, I noticed that the n32 semctl/shmctl/msgctl system calls behave differently based on whether o32 support is enabled or not: Without o32, the IPC_64 flag passed by user space is rejected but calls without that flag get IPC_64 behavior.
As far as I can tell, this was inadvertently changed by a cleanup patch but never noticed by anyone, possibly nobody has tried using sysvipc on n32 after linux-3.19.
Change it back to the old behavior now.
Fixes: 78aaf956ba3a ("MIPS: Compat: Fix build error if CONFIG_MIPS32_COMPAT but no compat ABI.") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Paul Burton paul.burton@mips.com Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -3135,6 +3135,7 @@ config MIPS32_O32 config MIPS32_N32 bool "Kernel support for n32 binaries" depends on 64BIT + select ARCH_WANT_COMPAT_IPC_PARSE_VERSION select COMPAT select MIPS32_COMPAT select SYSVIPC_COMPAT if SYSVIPC
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Hunter jonathanh@nvidia.com
commit ac4ca4b9f4623ba5e1ea7a582f286567c611e027 upstream.
The tps6586x driver creates an irqchip that is used by its various child devices for managing interrupts. The tps6586x-rtc device is one of its children that uses the tps6586x irqchip. When using the tps6586x-rtc as a wake-up device from suspend, the following is seen:
PM: Syncing filesystems ... done. Freezing user space processes ... (elapsed 0.001 seconds) done. OOM killer disabled. Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done. Disabling non-boot CPUs ... Entering suspend state LP1 Enabling non-boot CPUs ... CPU1 is up tps6586x 3-0034: failed to read interrupt status tps6586x 3-0034: failed to read interrupt status
The reason why the tps6586x interrupt status cannot be read is because the tps6586x interrupt is not masked during suspend and when the tps6586x-rtc interrupt occurs, to wake-up the device, the interrupt is seen before the i2c controller has been resumed in order to read the tps6586x interrupt status.
The tps6586x-rtc driver sets it's interrupt as a wake-up source during suspend, which gets propagated to the parent tps6586x interrupt. However, the tps6586x-rtc driver cannot disable it's interrupt during suspend otherwise we would never be woken up and so the tps6586x must disable it's interrupt instead.
Prevent the tps6586x interrupt handler from executing on exiting suspend before the i2c controller has been resumed by disabling the tps6586x interrupt on entering suspend and re-enabling it on resuming from suspend.
Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter jonathanh@nvidia.com Reviewed-by: Dmitry Osipenko digetx@gmail.com Tested-by: Dmitry Osipenko digetx@gmail.com Acked-by: Thierry Reding treding@nvidia.com Signed-off-by: Lee Jones lee.jones@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mfd/tps6586x.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
--- a/drivers/mfd/tps6586x.c +++ b/drivers/mfd/tps6586x.c @@ -594,6 +594,29 @@ static int tps6586x_i2c_remove(struct i2 return 0; }
+static int __maybe_unused tps6586x_i2c_suspend(struct device *dev) +{ + struct tps6586x *tps6586x = dev_get_drvdata(dev); + + if (tps6586x->client->irq) + disable_irq(tps6586x->client->irq); + + return 0; +} + +static int __maybe_unused tps6586x_i2c_resume(struct device *dev) +{ + struct tps6586x *tps6586x = dev_get_drvdata(dev); + + if (tps6586x->client->irq) + enable_irq(tps6586x->client->irq); + + return 0; +} + +static SIMPLE_DEV_PM_OPS(tps6586x_pm_ops, tps6586x_i2c_suspend, + tps6586x_i2c_resume); + static const struct i2c_device_id tps6586x_id_table[] = { { "tps6586x", 0 }, { }, @@ -604,6 +627,7 @@ static struct i2c_driver tps6586x_driver .driver = { .name = "tps6586x", .of_match_table = of_match_ptr(tps6586x_of_match), + .pm = &tps6586x_pm_ops, }, .probe = tps6586x_i2c_probe, .remove = tps6586x_i2c_remove,
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ard Biesheuvel ard.biesheuvel@linaro.org
commit 1598ecda7b239e9232dda032bfddeed9d89fab6c upstream.
kaslr_early_init() is called with the kernel mapped at its link time offset, and if it returns with a non-zero offset, the kernel is unmapped and remapped again at the randomized offset.
During its execution, kaslr_early_init() also randomizes the base of the module region and of the linear mapping of DRAM, and sets two variables accordingly. However, since these variables are assigned with the caches on, they may get lost during the cache maintenance that occurs when unmapping and remapping the kernel, so ensure that these values are cleaned to the PoC.
Acked-by: Catalin Marinas catalin.marinas@arm.com Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Signed-off-by: Will Deacon will.deacon@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kernel/kaslr.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -14,6 +14,7 @@ #include <linux/sched.h> #include <linux/types.h>
+#include <asm/cacheflush.h> #include <asm/fixmap.h> #include <asm/kernel-pgtable.h> #include <asm/memory.h> @@ -43,7 +44,7 @@ static __init u64 get_kaslr_seed(void *f return ret; }
-static __init const u8 *get_cmdline(void *fdt) +static __init const u8 *kaslr_get_cmdline(void *fdt) { static __initconst const u8 default_cmdline[] = CONFIG_CMDLINE;
@@ -109,7 +110,7 @@ u64 __init kaslr_early_init(u64 dt_phys, * Check if 'nokaslr' appears on the command line, and * return 0 if that is the case. */ - cmdline = get_cmdline(fdt); + cmdline = kaslr_get_cmdline(fdt); str = strstr(cmdline, "nokaslr"); if (str == cmdline || (str > cmdline && *(str - 1) == ' ')) return 0; @@ -178,5 +179,8 @@ u64 __init kaslr_early_init(u64 dt_phys, module_alloc_base += (module_range * (seed & ((1 << 21) - 1))) >> 21; module_alloc_base &= PAGE_MASK;
+ __flush_dcache_area(&module_alloc_base, sizeof(module_alloc_base)); + __flush_dcache_area(&memstart_offset_seed, sizeof(memstart_offset_seed)); + return offset; }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: YunQiang Su ysu@wavecomp.com
commit a214720cbf50cd8c3f76bbb9c3f5c283910e9d33 upstream.
Octeon has an boot-time option to disable pcie.
Since MSI depends on PCI-E, we should also disable MSI also with this option is on in order to avoid inadvertently accessing PCIe registers.
Signed-off-by: YunQiang Su ysu@wavecomp.com Signed-off-by: Paul Burton paul.burton@mips.com Cc: pburton@wavecomp.com Cc: linux-mips@vger.kernel.org Cc: aaro.koskinen@iki.fi Cc: stable@vger.kernel.org # v3.3+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/mips/pci/msi-octeon.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/mips/pci/msi-octeon.c +++ b/arch/mips/pci/msi-octeon.c @@ -369,7 +369,9 @@ int __init octeon_msi_initialize(void) int irq; struct irq_chip *msi;
- if (octeon_dma_bar_type == OCTEON_DMA_BAR_TYPE_PCIE) { + if (octeon_dma_bar_type == OCTEON_DMA_BAR_TYPE_INVALID) { + return 0; + } else if (octeon_dma_bar_type == OCTEON_DMA_BAR_TYPE_PCIE) { msi_rcv_reg[0] = CVMX_PEXP_NPEI_MSI_RCV0; msi_rcv_reg[1] = CVMX_PEXP_NPEI_MSI_RCV1; msi_rcv_reg[2] = CVMX_PEXP_NPEI_MSI_RCV2;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlad Tsyrklevich vlad@tsyrklevich.net
commit a01421e4484327fe44f8e126793ed5a48a221e24 upstream.
Using [1] for static analysis I found that the OMAPFB_QUERY_PLANE, OMAPFB_GET_COLOR_KEY, OMAPFB_GET_DISPLAY_INFO, and OMAPFB_GET_VRAM_INFO cases could all leak uninitialized stack memory--either due to uninitialized padding or 'reserved' fields.
Fix them by clearing the shared union used to store copied out data.
[1] https://github.com/vlad902/kernel-uninitialized-memory-checker
Signed-off-by: Vlad Tsyrklevich vlad@tsyrklevich.net Reviewed-by: Kees Cook keescook@chromium.org Fixes: b39a982ddecf ("OMAP: DSS2: omapfb driver") Cc: security@kernel.org [b.zolnierkie: prefix patch subject with "omap2fb: "] Signed-off-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c +++ b/drivers/video/fbdev/omap2/omapfb/omapfb-ioctl.c @@ -609,6 +609,8 @@ int omapfb_ioctl(struct fb_info *fbi, un
int r = 0;
+ memset(&p, 0, sizeof(p)); + switch (cmd) { case OMAPFB_SYNC_GFX: DBG("ioctl SYNC_GFX\n");
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
commit 701f49bc028edb19ffccd101997dd84f0d71e279 upstream.
kthread_run returns an error pointer, but elsewhere in the code dev->kthread_vid_cap/out is checked against NULL.
If kthread_run returns an error, then set the pointer to NULL.
I chose this method over changing all kthread_vid_cap/out tests elsewhere since this is more robust.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Reported-by: syzbot+53d5b2df0d9744411e2e@syzkaller.appspotmail.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/vivid/vivid-kthread-cap.c | 5 ++++- drivers/media/platform/vivid/vivid-kthread-out.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/media/platform/vivid/vivid-kthread-cap.c +++ b/drivers/media/platform/vivid/vivid-kthread-cap.c @@ -877,8 +877,11 @@ int vivid_start_generating_vid_cap(struc "%s-vid-cap", dev->v4l2_dev.name);
if (IS_ERR(dev->kthread_vid_cap)) { + int err = PTR_ERR(dev->kthread_vid_cap); + + dev->kthread_vid_cap = NULL; v4l2_err(&dev->v4l2_dev, "kernel_thread() failed\n"); - return PTR_ERR(dev->kthread_vid_cap); + return err; } *pstreaming = true; vivid_grab_controls(dev, true); --- a/drivers/media/platform/vivid/vivid-kthread-out.c +++ b/drivers/media/platform/vivid/vivid-kthread-out.c @@ -248,8 +248,11 @@ int vivid_start_generating_vid_out(struc "%s-vid-out", dev->v4l2_dev.name);
if (IS_ERR(dev->kthread_vid_out)) { + int err = PTR_ERR(dev->kthread_vid_out); + + dev->kthread_vid_out = NULL; v4l2_err(&dev->v4l2_dev, "kernel_thread() failed\n"); - return PTR_ERR(dev->kthread_vid_out); + return err; } *pstreaming = true; vivid_grab_controls(dev, true);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
commit 9729d6d282a6d7ce88e64c9119cecdf79edf4e88 upstream.
The capture DV timings capabilities allowed for a minimum width and height of 0. So passing a timings struct with 0 values is allowed and will later cause a division by zero.
Ensure that the width and height must be >= 16 to avoid this.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Reported-by: syzbot+57c3d83d71187054d56f@syzkaller.appspotmail.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/platform/vivid/vivid-vid-common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/vivid/vivid-vid-common.c +++ b/drivers/media/platform/vivid/vivid-vid-common.c @@ -33,7 +33,7 @@ const struct v4l2_dv_timings_cap vivid_d .type = V4L2_DV_BT_656_1120, /* keep this initialization for compatibility with GCC < 4.4.6 */ .reserved = { 0 }, - V4L2_INIT_BT_TIMINGS(0, MAX_WIDTH, 0, MAX_HEIGHT, 14000000, 775000000, + V4L2_INIT_BT_TIMINGS(16, MAX_WIDTH, 16, MAX_HEIGHT, 14000000, 775000000, V4L2_DV_BT_STD_CEA861 | V4L2_DV_BT_STD_DMT | V4L2_DV_BT_STD_CVT | V4L2_DV_BT_STD_GTF, V4L2_DV_BT_CAP_PROGRESSIVE | V4L2_DV_BT_CAP_INTERLACED)
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morris james.morris@microsoft.com
commit a5795fd38ee8194451ba3f281f075301a3696ce2 upstream.
From: Casey Schaufler casey@schaufler-ca.com
Check that the cred security blob has been set before trying to clean it up. There is a case during credential initialization that could result in this.
Signed-off-by: Casey Schaufler casey@schaufler-ca.com Acked-by: John Johansen john.johansen@canonical.com Signed-off-by: James Morris james.morris@microsoft.com Reported-by: syzbot+69ca07954461f189e808@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/security.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/security/security.c +++ b/security/security.c @@ -904,6 +904,13 @@ int security_cred_alloc_blank(struct cre
void security_cred_free(struct cred *cred) { + /* + * There is a failure case in prepare_creds() that + * may result in a call here with ->security being NULL. + */ + if (unlikely(cred->security == NULL)) + return; + call_void_hook(cred_free, cred); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil@xs4all.nl
commit cd26d1c4d1bc947b56ae404998ae2276df7b39b7 upstream.
If a filehandle is dup()ped, then it is possible to close it from one fd and call mmap from the other. This creates a race condition in vb2_mmap where it is using queue data that __vb2_queue_free (called from close()) is in the process of releasing.
By moving up the mutex_lock(mmap_lock) in vb2_mmap this race is avoided since __vb2_queue_free is called with the same mutex locked. So vb2_mmap now reads consistent buffer data.
Signed-off-by: Hans Verkuil hverkuil@xs4all.nl Reported-by: syzbot+be93025dd45dccd8923c@syzkaller.appspotmail.com Signed-off-by: Hans Verkuil hansverk@cisco.com Signed-off-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/media/v4l2-core/videobuf2-core.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/drivers/media/v4l2-core/videobuf2-core.c +++ b/drivers/media/v4l2-core/videobuf2-core.c @@ -1916,9 +1916,13 @@ int vb2_mmap(struct vb2_queue *q, struct return -EINVAL; } } + + mutex_lock(&q->mmap_lock); + if (vb2_fileio_is_active(q)) { dprintk(1, "mmap: file io in progress\n"); - return -EBUSY; + ret = -EBUSY; + goto unlock; }
/* @@ -1926,7 +1930,7 @@ int vb2_mmap(struct vb2_queue *q, struct */ ret = __find_plane_by_offset(q, off, &buffer, &plane); if (ret) - return ret; + goto unlock;
vb = q->bufs[buffer];
@@ -1942,8 +1946,9 @@ int vb2_mmap(struct vb2_queue *q, struct return -EINVAL; }
- mutex_lock(&q->mmap_lock); ret = call_memop(vb, mmap, vb->planes[plane].mem_priv, vma); + +unlock: mutex_unlock(&q->mmap_lock); if (ret) return ret;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: J. Bruce Fields bfields@redhat.com
commit 81c88b18de1f11f70c97f28ced8d642c00bb3955 upstream.
If we ignore the error we'll hit a null dereference a little later.
Reported-by: syzbot+4b98281f2401ab849f4b@syzkaller.appspotmail.com Signed-off-by: J. Bruce Fields bfields@redhat.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sunrpc/rpcb_clnt.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/net/sunrpc/rpcb_clnt.c +++ b/net/sunrpc/rpcb_clnt.c @@ -770,6 +770,12 @@ void rpcb_getport_async(struct rpc_task case RPCBVERS_3: map->r_netid = xprt->address_strings[RPC_DISPLAY_NETID]; map->r_addr = rpc_sockaddr2uaddr(sap, GFP_ATOMIC); + if (!map->r_addr) { + status = -ENOMEM; + dprintk("RPC: %5u %s: no memory available\n", + task->tk_pid, __func__); + goto bailout_free_args; + } map->r_owner = ""; break; case RPCBVERS_2: @@ -792,6 +798,8 @@ void rpcb_getport_async(struct rpc_task rpc_put_task(child); return;
+bailout_free_args: + kfree(map); bailout_release_client: rpc_release_client(rpcb_clnt); bailout_nofree:
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shakeel Butt shakeelb@google.com
commit e2c8d550a973bb34fc28bc8d0ec996f84562fb8a upstream.
The [ip,ip6,arp]_tables use x_tables_info internally and the underlying memory is already accounted to kmemcg. Do the same for ebtables. The syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the whole system from a restricted memcg, a potential DoS.
By accounting the ebt_table_info, the memory used for ebt_table_info can be contained within the memcg of the allocating process. However the lifetime of ebt_table_info is independent of the allocating process and is tied to the network namespace. So, the oom-killer will not be able to relieve the memory pressure due to ebt_table_info memory. The memory for ebt_table_info is allocated through vmalloc. Currently vmalloc does not handle the oom-killed allocating process correctly and one large allocation can bypass memcg limit enforcement. So, with this patch, at least the small allocations will be contained. For large allocations, we need to fix vmalloc.
Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com Signed-off-by: Shakeel Butt shakeelb@google.com Reviewed-by: Kirill Tkhai ktkhai@virtuozzo.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/bridge/netfilter/ebtables.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -1147,14 +1147,16 @@ static int do_replace(struct net *net, c tmp.name[sizeof(tmp.name) - 1] = 0;
countersize = COUNTER_OFFSET(tmp.nentries) * nr_cpu_ids; - newinfo = vmalloc(sizeof(*newinfo) + countersize); + newinfo = __vmalloc(sizeof(*newinfo) + countersize, GFP_KERNEL_ACCOUNT, + PAGE_KERNEL); if (!newinfo) return -ENOMEM;
if (countersize) memset(newinfo->counters, 0, countersize);
- newinfo->entries = vmalloc(tmp.entries_size); + newinfo->entries = __vmalloc(tmp.entries_size, GFP_KERNEL_ACCOUNT, + PAGE_KERNEL); if (!newinfo->entries) { ret = -ENOMEM; goto free_newinfo;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephen Smalley sds@tycho.nsa.gov
commit 5b0e7310a2a33c06edc7eb81ffc521af9b2c5610 upstream.
levdatum->level can be NULL if we encounter an error while loading the policy during sens_read prior to initializing it. Make sure sens_destroy handles that case correctly.
Reported-by: syzbot+6664500f0f18f07a5c0e@syzkaller.appspotmail.com Signed-off-by: Stephen Smalley sds@tycho.nsa.gov Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/selinux/ss/policydb.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/security/selinux/ss/policydb.c +++ b/security/selinux/ss/policydb.c @@ -726,7 +726,8 @@ static int sens_destroy(void *key, void kfree(key); if (datum) { levdatum = datum; - ebitmap_destroy(&levdatum->level->cat); + if (levdatum->level) + ebitmap_destroy(&levdatum->level->cat); kfree(levdatum->level); } kfree(datum);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 04906b2f542c23626b0ef6219b808406f8dddbe9 upstream.
bd_set_size() updates also block device's block size. This is somewhat unexpected from its name and at this point, only blkdev_open() uses this functionality. Furthermore, this can result in changing block size under a filesystem mounted on a loop device which leads to livelocks inside __getblk_gfp() like:
Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 10863 Comm: syz-executor0 Not tainted 4.18.0-rc5+ #151 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__sanitizer_cov_trace_pc+0x3f/0x50 kernel/kcov.c:106 ... Call Trace: init_page_buffers+0x3e2/0x530 fs/buffer.c:904 grow_dev_page fs/buffer.c:947 [inline] grow_buffers fs/buffer.c:1009 [inline] __getblk_slow fs/buffer.c:1036 [inline] __getblk_gfp+0x906/0xb10 fs/buffer.c:1313 __bread_gfp+0x2d/0x310 fs/buffer.c:1347 sb_bread include/linux/buffer_head.h:307 [inline] fat12_ent_bread+0x14e/0x3d0 fs/fat/fatent.c:75 fat_ent_read_block fs/fat/fatent.c:441 [inline] fat_alloc_clusters+0x8ce/0x16e0 fs/fat/fatent.c:489 fat_add_cluster+0x7a/0x150 fs/fat/inode.c:101 __fat_get_block fs/fat/inode.c:148 [inline] ...
Trivial reproducer for the problem looks like:
truncate -s 1G /tmp/image losetup /dev/loop0 /tmp/image mkfs.ext4 -b 1024 /dev/loop0 mount -t ext4 /dev/loop0 /mnt losetup -c /dev/loop0 l /mnt
Fix the problem by moving initialization of a block device block size into a separate function and call it when needed.
Thanks to Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp for help with debugging the problem.
Reported-by: syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/block_dev.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-)
--- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -114,6 +114,20 @@ void invalidate_bdev(struct block_device } EXPORT_SYMBOL(invalidate_bdev);
+static void set_init_blocksize(struct block_device *bdev) +{ + unsigned bsize = bdev_logical_block_size(bdev); + loff_t size = i_size_read(bdev->bd_inode); + + while (bsize < PAGE_SIZE) { + if (size & bsize) + break; + bsize <<= 1; + } + bdev->bd_block_size = bsize; + bdev->bd_inode->i_blkbits = blksize_bits(bsize); +} + int set_blocksize(struct block_device *bdev, int size) { /* Size must be a power of two, and between 512 and PAGE_SIZE */ @@ -1209,18 +1223,9 @@ EXPORT_SYMBOL(check_disk_change);
void bd_set_size(struct block_device *bdev, loff_t size) { - unsigned bsize = bdev_logical_block_size(bdev); - inode_lock(bdev->bd_inode); i_size_write(bdev->bd_inode, size); inode_unlock(bdev->bd_inode); - while (bsize < PAGE_SIZE) { - if (size & bsize) - break; - bsize <<= 1; - } - bdev->bd_block_size = bsize; - bdev->bd_inode->i_blkbits = blksize_bits(bsize); } EXPORT_SYMBOL(bd_set_size);
@@ -1297,8 +1302,10 @@ static int __blkdev_get(struct block_dev } }
- if (!ret) + if (!ret) { bd_set_size(bdev,(loff_t)get_capacity(disk)<<9); + set_init_blocksize(bdev); + }
/* * If the device is invalidated, rescan partition @@ -1333,6 +1340,7 @@ static int __blkdev_get(struct block_dev goto out_clear; } bd_set_size(bdev, (loff_t)bdev->bd_part->nr_sects << 9); + set_init_blocksize(bdev); } } else { if (bdev->bd_contains == bdev) {
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Long lucien.xin@gmail.com
commit 400b8b9a2a17918f8ce00786f596f530e7f30d50 upstream.
The similar issue as fixed in Commit 4a2eb0c37b47 ("sctp: initialize sin6_flowinfo for ipv6 addrs in sctp_inet6addr_event") also exists in sctp_inetaddr_event, as Alexander noticed.
To fix it, allocate sctp_sockaddr_entry with kzalloc for both sctp ipv4 and ipv6 addresses, as does in sctp_v4/6_copy_addrlist().
Reported-by: Alexander Potapenko glider@google.com Signed-off-by: Xin Long lucien.xin@gmail.com Reported-by: syzbot+ae0c70c0c2d40c51bb92@syzkaller.appspotmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Acked-by: Neil Horman nhorman@tuxdriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sctp/ipv6.c | 5 +---- net/sctp/protocol.c | 4 +--- 2 files changed, 2 insertions(+), 7 deletions(-)
--- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -97,11 +97,9 @@ static int sctp_inet6addr_event(struct n
switch (ev) { case NETDEV_UP: - addr = kmalloc(sizeof(struct sctp_sockaddr_entry), GFP_ATOMIC); + addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v6.sin6_family = AF_INET6; - addr->a.v6.sin6_port = 0; - addr->a.v6.sin6_flowinfo = 0; addr->a.v6.sin6_addr = ifa->addr; addr->a.v6.sin6_scope_id = ifa->idev->dev->ifindex; addr->valid = 1; @@ -413,7 +411,6 @@ static void sctp_v6_copy_addrlist(struct addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v6.sin6_family = AF_INET6; - addr->a.v6.sin6_port = 0; addr->a.v6.sin6_addr = ifp->addr; addr->a.v6.sin6_scope_id = dev->ifindex; addr->valid = 1; --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -151,7 +151,6 @@ static void sctp_v4_copy_addrlist(struct addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v4.sin_family = AF_INET; - addr->a.v4.sin_port = 0; addr->a.v4.sin_addr.s_addr = ifa->ifa_local; addr->valid = 1; INIT_LIST_HEAD(&addr->list); @@ -777,10 +776,9 @@ static int sctp_inetaddr_event(struct no
switch (ev) { case NETDEV_UP: - addr = kmalloc(sizeof(struct sctp_sockaddr_entry), GFP_ATOMIC); + addr = kzalloc(sizeof(*addr), GFP_ATOMIC); if (addr) { addr->a.v4.sin_family = AF_INET; - addr->a.v4.sin_port = 0; addr->a.v4.sin_addr.s_addr = ifa->ifa_local; addr->valid = 1; spin_lock_bh(&net->sctp.local_addr_lock);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 8b66fee7f8ee18f9c51260e7a43ab37db5177a05 upstream.
syzbot reports following splat:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486 CPU: 1 PID: 11057 Comm: syz-executor0 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:295 strlen+0x3b/0xa0 lib/string.c:486 nla_put_string include/net/netlink.h:1154 [inline] tipc_nl_compat_link_reset_stats+0x1f0/0x360 net/tipc/netlink_compat.c:760 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline] tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x457ec9 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2557338c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f25573396d4 R13: 00000000004cb478 R14: 00000000004d86c8 R15: 00000000ffffffff
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline] kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2759 [inline] __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383 __kmalloc_reserve net/core/skbuff.c:137 [inline] __alloc_skb+0x309/0xa20 net/core/skbuff.c:205 alloc_skb include/linux/skbuff.h:998 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline] netlink_sendmsg+0xb82/0x1300 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
The uninitialised access happened in tipc_nl_compat_link_reset_stats: nla_put_string(skb, TIPC_NLA_LINK_NAME, name)
This is because name string is not validated before it's used.
Reported-by: syzbot+e01d94b5a4c266be6e4c@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -87,6 +87,11 @@ static int tipc_skb_tailroom(struct sk_b return limit; }
+static inline int TLV_GET_DATA_LEN(struct tlv_desc *tlv) +{ + return TLV_GET_LEN(tlv) - TLV_SPACE(0); +} + static int tipc_add_tlv(struct sk_buff *skb, u16 type, void *data, u16 len) { struct tlv_desc *tlv = (struct tlv_desc *)skb_tail_pointer(skb); @@ -166,6 +171,11 @@ static struct sk_buff *tipc_get_err_tlv( return buf; }
+static inline bool string_is_valid(char *s, int len) +{ + return memchr(s, '\0', len) ? true : false; +} + static int __tipc_nl_compat_dumpit(struct tipc_nl_compat_cmd_dump *cmd, struct tipc_nl_compat_msg *msg, struct sk_buff *arg) @@ -741,6 +751,7 @@ static int tipc_nl_compat_link_reset_sta { char *name; struct nlattr *link; + int len;
name = (char *)TLV_DATA(msg->req);
@@ -748,6 +759,10 @@ static int tipc_nl_compat_link_reset_sta if (!link) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME); + if (!string_is_valid(name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_LINK_NAME, name)) return -EMSGSIZE;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 0762216c0ad2a2fccd63890648eca491f2c83d9a upstream.
syzbot reported:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:484 CPU: 1 PID: 6371 Comm: syz-executor652 Not tainted 4.19.0-rc8+ #70 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x306/0x460 lib/dump_stack.c:113 kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917 __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500 strlen+0x3b/0xa0 lib/string.c:484 nla_put_string include/net/netlink.h:1011 [inline] tipc_nl_compat_bearer_enable+0x238/0x7b0 net/tipc/netlink_compat.c:389 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x39f/0xae0 net/tipc/netlink_compat.c:344 tipc_nl_compat_recv+0x147c/0x2760 net/tipc/netlink_compat.c:1107 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x440179 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffef7beee8 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179 RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00 R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline] kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180 kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2727 [inline] __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x422/0xe90 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:996 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline] netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
The root cause is that we don't validate whether bear name is a valid string in tipc_nl_compat_bearer_enable().
Meanwhile, we also fix the same issue in the following functions: tipc_nl_compat_bearer_disable() tipc_nl_compat_link_stat_dump() tipc_nl_compat_media_set() tipc_nl_compat_bearer_set()
Reported-by: syzbot+b33d5cae0efd35dbfe77@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -380,6 +380,7 @@ static int tipc_nl_compat_bearer_enable( struct nlattr *prop; struct nlattr *bearer; struct tipc_bearer_config *b; + int len;
b = (struct tipc_bearer_config *)TLV_DATA(msg->req);
@@ -387,6 +388,10 @@ static int tipc_nl_compat_bearer_enable( if (!bearer) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME); + if (!string_is_valid(b->name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_BEARER_NAME, b->name)) return -EMSGSIZE;
@@ -412,6 +417,7 @@ static int tipc_nl_compat_bearer_disable { char *name; struct nlattr *bearer; + int len;
name = (char *)TLV_DATA(msg->req);
@@ -419,6 +425,10 @@ static int tipc_nl_compat_bearer_disable if (!bearer) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME); + if (!string_is_valid(name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_BEARER_NAME, name)) return -EMSGSIZE;
@@ -479,6 +489,7 @@ static int tipc_nl_compat_link_stat_dump struct nlattr *prop[TIPC_NLA_PROP_MAX + 1]; struct nlattr *stats[TIPC_NLA_STATS_MAX + 1]; int err; + int len;
if (!attrs[TIPC_NLA_LINK]) return -EINVAL; @@ -505,6 +516,11 @@ static int tipc_nl_compat_link_stat_dump return err;
name = (char *)TLV_DATA(msg->req); + + len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME); + if (!string_is_valid(name, len)) + return -EINVAL; + if (strcmp(name, nla_data(link[TIPC_NLA_LINK_NAME])) != 0) return 0;
@@ -645,6 +661,7 @@ static int tipc_nl_compat_media_set(stru struct nlattr *prop; struct nlattr *media; struct tipc_link_config *lc; + int len;
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
@@ -652,6 +669,10 @@ static int tipc_nl_compat_media_set(stru if (!media) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_MEDIA_NAME); + if (!string_is_valid(lc->name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_MEDIA_NAME, lc->name)) return -EMSGSIZE;
@@ -672,6 +693,7 @@ static int tipc_nl_compat_bearer_set(str struct nlattr *prop; struct nlattr *bearer; struct tipc_link_config *lc; + int len;
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
@@ -679,6 +701,10 @@ static int tipc_nl_compat_bearer_set(str if (!bearer) return -EMSGSIZE;
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_MEDIA_NAME); + if (!string_is_valid(lc->name, len)) + return -EINVAL; + if (nla_put_string(skb, TIPC_NLA_BEARER_NAME, lc->name)) return -EMSGSIZE;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit edf5ff04a45750ac8ce2435974f001dc9cfbf055 upstream.
syzbot reports following splat:
BUG: KMSAN: uninit-value in strlen+0x3b/0xa0 lib/string.c:486 CPU: 1 PID: 9306 Comm: syz-executor172 Not tainted 4.20.0-rc7+ #2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313 strlen+0x3b/0xa0 lib/string.c:486 nla_put_string include/net/netlink.h:1154 [inline] __tipc_nl_compat_link_set net/tipc/netlink_compat.c:708 [inline] tipc_nl_compat_link_set+0x929/0x1220 net/tipc/netlink_compat.c:744 __tipc_nl_compat_doit net/tipc/netlink_compat.c:311 [inline] tipc_nl_compat_doit+0x3aa/0xaf0 net/tipc/netlink_compat.c:344 tipc_nl_compat_handle net/tipc/netlink_compat.c:1107 [inline] tipc_nl_compat_recv+0x14d7/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185f/0x1a60 net/netlink/genetlink.c:626 netlink_rcv_skb+0x444/0x640 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0xf40/0x1020 net/netlink/af_netlink.c:1336 netlink_sendmsg+0x127f/0x1300 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xdb9/0x11b0 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
The uninitialised access happened in nla_put_string(skb, TIPC_NLA_LINK_NAME, lc->name)
This is because lc->name string is not validated before it's used.
Reported-by: syzbot+d78b8a29241a195aefb8@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -753,9 +753,14 @@ static int tipc_nl_compat_link_set(struc struct tipc_link_config *lc; struct tipc_bearer *bearer; struct tipc_media *media; + int len;
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
+ len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME); + if (!string_is_valid(lc->name, len)) + return -EINVAL; + media = tipc_media_find(lc->name); if (media) { cmd->doit = &tipc_nl_media_set;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 974cb0e3e7c963ced06c4e32c5b2884173fa5e01 upstream.
syzbot reported:
BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline] BUG: KMSAN: uninit-value in tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826 CPU: 0 PID: 6290 Comm: syz-executor848 Not tainted 4.19.0-rc8+ #70 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x306/0x460 lib/dump_stack.c:113 kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:917 __msan_warning+0x7c/0xe0 mm/kmsan/kmsan_instr.c:500 __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline] __fswab32 include/uapi/linux/swab.h:59 [inline] tipc_nl_compat_name_table_dump+0x4a8/0xba0 net/tipc/netlink_compat.c:826 __tipc_nl_compat_dumpit+0x59e/0xdb0 net/tipc/netlink_compat.c:205 tipc_nl_compat_dumpit+0x63a/0x820 net/tipc/netlink_compat.c:270 tipc_nl_compat_handle net/tipc/netlink_compat.c:1151 [inline] tipc_nl_compat_recv+0x1402/0x2760 net/tipc/netlink_compat.c:1210 genl_family_rcv_msg net/netlink/genetlink.c:601 [inline] genl_rcv_msg+0x185c/0x1a20 net/netlink/genetlink.c:626 netlink_rcv_skb+0x394/0x640 net/netlink/af_netlink.c:2454 genl_rcv+0x63/0x80 net/netlink/genetlink.c:637 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x166d/0x1720 net/netlink/af_netlink.c:1343 netlink_sendmsg+0x1391/0x1420 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x440179 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffecec49318 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440179 RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8 R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000401a00 R13: 0000000000401a90 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:255 [inline] kmsan_internal_poison_shadow+0xc8/0x1d0 mm/kmsan/kmsan.c:180 kmsan_kmalloc+0xa4/0x120 mm/kmsan/kmsan_hooks.c:104 kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan_hooks.c:113 slab_post_alloc_hook mm/slab.h:446 [inline] slab_alloc_node mm/slub.c:2727 [inline] __kmalloc_node_track_caller+0xb43/0x1400 mm/slub.c:4360 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x422/0xe90 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:996 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1189 [inline] netlink_sendmsg+0xcaf/0x1420 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg net/socket.c:631 [inline] ___sys_sendmsg+0xe47/0x1200 net/socket.c:2116 __sys_sendmsg net/socket.c:2154 [inline] __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg+0x307/0x460 net/socket.c:2161 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2161 do_syscall_64+0xbe/0x100 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7
We cannot take for granted the thing that the length of data contained in TLV is longer than the size of struct tipc_name_table_query in tipc_nl_compat_name_table_dump().
Reported-by: syzbot+06e771a754829716a327@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -815,6 +815,8 @@ static int tipc_nl_compat_name_table_dum };
ntq = (struct tipc_name_table_query *)TLV_DATA(msg->req); + if (TLV_GET_DATA_LEN(msg->req) < sizeof(struct tipc_name_table_query)) + return -EINVAL;
depth = ntohl(ntq->depth);
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ying Xue ying.xue@windriver.com
commit 2753ca5d9009c180dbfd4c802c80983b4b6108d1 upstream.
BUG: KMSAN: uninit-value in tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335 CPU: 0 PID: 4514 Comm: syz-executor485 Not tainted 4.16.0+ #87 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683 tipc_nl_compat_doit+0x404/0xa10 net/tipc/netlink_compat.c:335 tipc_nl_compat_recv+0x164b/0x2700 net/tipc/netlink_compat.c:1153 genl_family_rcv_msg net/netlink/genetlink.c:599 [inline] genl_rcv_msg+0x1686/0x1810 net/netlink/genetlink.c:624 netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2447 genl_rcv+0x63/0x80 net/netlink/genetlink.c:635 netlink_unicast_kernel net/netlink/af_netlink.c:1311 [inline] netlink_unicast+0x166b/0x1740 net/netlink/af_netlink.c:1337 netlink_sendmsg+0x1048/0x1310 net/netlink/af_netlink.c:1900 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmsg net/socket.c:2080 [inline] SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091 SyS_sendmsg+0x54/0x80 net/socket.c:2087 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x43fda9 RSP: 002b:00007ffd0c184ba8 EFLAGS: 00000213 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fda9 RDX: 0000000000000000 RSI: 0000000020023000 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000004002c8 R11: 0000000000000213 R12: 00000000004016d0 R13: 0000000000401760 R14: 0000000000000000 R15: 0000000000000000
Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1183 [inline] netlink_sendmsg+0x9a6/0x1310 net/netlink/af_netlink.c:1875 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046 __sys_sendmsg net/socket.c:2080 [inline] SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091 SyS_sendmsg+0x54/0x80 net/socket.c:2087 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
In tipc_nl_compat_recv(), when the len variable returned by nlmsg_attrlen() is 0, the message is still treated as a valid one, which is obviously unresonable. When len is zero, it means the message not only doesn't contain any valid TLV payload, but also TLV header is not included. Under this stituation, tlv_type field in TLV header is still accessed in tipc_nl_compat_dumpit() or tipc_nl_compat_doit(), but the field space is obviously illegal. Of course, it is not initialized.
Reported-by: syzbot+bca0dc46634781f08b38@syzkaller.appspotmail.com Reported-by: syzbot+6bdb590321a7ae40c1a6@syzkaller.appspotmail.com Signed-off-by: Ying Xue ying.xue@windriver.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/tipc/netlink_compat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -1240,7 +1240,7 @@ static int tipc_nl_compat_recv(struct sk }
len = nlmsg_attrlen(req_nlh, GENL_HDRLEN + TIPC_GENL_HDRLEN); - if (len && !TLV_OK(msg.req, len)) { + if (!len || !TLV_OK(msg.req, len)) { msg.rep = tipc_get_err_tlv(TIPC_CFG_NOT_SUPPORTED); err = -EOPNOTSUPP; goto send;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit 310ca162d779efee8a2dc3731439680f3e9c1e86 upstream.
syzbot is reporting NULL pointer dereference [1] which is caused by race condition between ioctl(loop_fd, LOOP_CLR_FD, 0) versus ioctl(other_loop_fd, LOOP_SET_FD, loop_fd) due to traversing other loop devices at loop_validate_file() without holding corresponding lo->lo_ctl_mutex locks.
Since ioctl() request on loop devices is not frequent operation, we don't need fine grained locking. Let's use global lock in order to allow safe traversal at loop_validate_file().
Note that syzbot is also reporting circular locking dependency between bdev->bd_mutex and lo->lo_ctl_mutex [2] which is caused by calling blkdev_reread_part() with lock held. This patch does not address it.
[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a... [2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d1588...
Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reported-by: syzbot syzbot+bf89c128e05dd6c62523@syzkaller.appspotmail.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 42 +++++++++++++++++++++--------------------- drivers/block/loop.h | 1 - 2 files changed, 21 insertions(+), 22 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -82,6 +82,7 @@
static DEFINE_IDR(loop_index_idr); static DEFINE_MUTEX(loop_index_mutex); +static DEFINE_MUTEX(loop_ctl_mutex);
static int max_part; static int part_shift; @@ -1033,7 +1034,7 @@ static int loop_clr_fd(struct loop_devic */ if (atomic_read(&lo->lo_refcnt) > 1) { lo->lo_flags |= LO_FLAGS_AUTOCLEAR; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return 0; }
@@ -1082,12 +1083,12 @@ static int loop_clr_fd(struct loop_devic if (!part_shift) lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN; loop_unprepare_queue(lo); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); /* - * Need not hold lo_ctl_mutex to fput backing file. - * Calling fput holding lo_ctl_mutex triggers a circular + * Need not hold loop_ctl_mutex to fput backing file. + * Calling fput holding loop_ctl_mutex triggers a circular * lock dependency possibility warning as fput can take - * bd_mutex which is usually taken before lo_ctl_mutex. + * bd_mutex which is usually taken before loop_ctl_mutex. */ fput(filp); return 0; @@ -1350,7 +1351,7 @@ static int lo_ioctl(struct block_device struct loop_device *lo = bdev->bd_disk->private_data; int err;
- mutex_lock_nested(&lo->lo_ctl_mutex, 1); + mutex_lock_nested(&loop_ctl_mutex, 1); switch (cmd) { case LOOP_SET_FD: err = loop_set_fd(lo, mode, bdev, arg); @@ -1359,7 +1360,7 @@ static int lo_ioctl(struct block_device err = loop_change_fd(lo, bdev, arg); break; case LOOP_CLR_FD: - /* loop_clr_fd would have unlocked lo_ctl_mutex on success */ + /* loop_clr_fd would have unlocked loop_ctl_mutex on success */ err = loop_clr_fd(lo); if (!err) goto out_unlocked; @@ -1395,7 +1396,7 @@ static int lo_ioctl(struct block_device default: err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL; } - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex);
out_unlocked: return err; @@ -1528,16 +1529,16 @@ static int lo_compat_ioctl(struct block_
switch(cmd) { case LOOP_SET_STATUS: - mutex_lock(&lo->lo_ctl_mutex); + mutex_lock(&loop_ctl_mutex); err = loop_set_status_compat( lo, (const struct compat_loop_info __user *) arg); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); break; case LOOP_GET_STATUS: - mutex_lock(&lo->lo_ctl_mutex); + mutex_lock(&loop_ctl_mutex); err = loop_get_status_compat( lo, (struct compat_loop_info __user *) arg); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); break; case LOOP_SET_CAPACITY: case LOOP_CLR_FD: @@ -1581,7 +1582,7 @@ static void __lo_release(struct loop_dev if (atomic_dec_return(&lo->lo_refcnt)) return;
- mutex_lock(&lo->lo_ctl_mutex); + mutex_lock(&loop_ctl_mutex); if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { /* * In autoclear mode, stop the loop thread @@ -1598,7 +1599,7 @@ static void __lo_release(struct loop_dev loop_flush(lo); }
- mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); }
static void lo_release(struct gendisk *disk, fmode_t mode) @@ -1644,10 +1645,10 @@ static int unregister_transfer_cb(int id struct loop_device *lo = ptr; struct loop_func_table *xfer = data;
- mutex_lock(&lo->lo_ctl_mutex); + mutex_lock(&loop_ctl_mutex); if (lo->lo_encryption == xfer) loop_release_xfer(lo); - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); return 0; }
@@ -1813,7 +1814,6 @@ static int loop_add(struct loop_device * if (!part_shift) disk->flags |= GENHD_FL_NO_PART_SCAN; disk->flags |= GENHD_FL_EXT_DEVT; - mutex_init(&lo->lo_ctl_mutex); atomic_set(&lo->lo_refcnt, 0); lo->lo_number = i; spin_lock_init(&lo->lo_lock); @@ -1926,19 +1926,19 @@ static long loop_control_ioctl(struct fi ret = loop_lookup(&lo, parm); if (ret < 0) break; - mutex_lock(&lo->lo_ctl_mutex); + mutex_lock(&loop_ctl_mutex); if (lo->lo_state != Lo_unbound) { ret = -EBUSY; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); break; } if (atomic_read(&lo->lo_refcnt) > 0) { ret = -EBUSY; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); break; } lo->lo_disk->private_data = NULL; - mutex_unlock(&lo->lo_ctl_mutex); + mutex_unlock(&loop_ctl_mutex); idr_remove(&loop_index_idr, lo->lo_number); loop_remove(lo); break; --- a/drivers/block/loop.h +++ b/drivers/block/loop.h @@ -55,7 +55,6 @@ struct loop_device {
spinlock_t lo_lock; int lo_state; - struct mutex lo_ctl_mutex; struct kthread_worker worker; struct task_struct *worker_task; bool use_dio;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 967d1dc144b50ad005e5eecdfadfbcfb399ffff6 upstream.
__loop_release() has a single call site. Fold it there. This is currently not a huge win but it will make following replacement of loop_index_mutex more obvious.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1575,12 +1575,15 @@ out: return err; }
-static void __lo_release(struct loop_device *lo) +static void lo_release(struct gendisk *disk, fmode_t mode) { + struct loop_device *lo; int err;
+ mutex_lock(&loop_index_mutex); + lo = disk->private_data; if (atomic_dec_return(&lo->lo_refcnt)) - return; + goto unlock_index;
mutex_lock(&loop_ctl_mutex); if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { @@ -1590,7 +1593,7 @@ static void __lo_release(struct loop_dev */ err = loop_clr_fd(lo); if (!err) - return; + goto unlock_index; } else { /* * Otherwise keep thread (if running) and config, @@ -1600,12 +1603,7 @@ static void __lo_release(struct loop_dev }
mutex_unlock(&loop_ctl_mutex); -} - -static void lo_release(struct gendisk *disk, fmode_t mode) -{ - mutex_lock(&loop_index_mutex); - __lo_release(disk->private_data); +unlock_index: mutex_unlock(&loop_index_mutex); }
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 0a42e99b58a208839626465af194cfe640ef9493 upstream.
Now that loop_ctl_mutex is global, just get rid of loop_index_mutex as there is no good reason to keep these two separate and it just complicates the locking.
Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 39 ++++++++++++++++++++------------------- 1 file changed, 20 insertions(+), 19 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -81,7 +81,6 @@ #include <asm/uaccess.h>
static DEFINE_IDR(loop_index_idr); -static DEFINE_MUTEX(loop_index_mutex); static DEFINE_MUTEX(loop_ctl_mutex);
static int max_part; @@ -1560,9 +1559,11 @@ static int lo_compat_ioctl(struct block_ static int lo_open(struct block_device *bdev, fmode_t mode) { struct loop_device *lo; - int err = 0; + int err;
- mutex_lock(&loop_index_mutex); + err = mutex_lock_killable(&loop_ctl_mutex); + if (err) + return err; lo = bdev->bd_disk->private_data; if (!lo) { err = -ENXIO; @@ -1571,7 +1572,7 @@ static int lo_open(struct block_device *
atomic_inc(&lo->lo_refcnt); out: - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex); return err; }
@@ -1580,12 +1581,11 @@ static void lo_release(struct gendisk *d struct loop_device *lo; int err;
- mutex_lock(&loop_index_mutex); + mutex_lock(&loop_ctl_mutex); lo = disk->private_data; if (atomic_dec_return(&lo->lo_refcnt)) - goto unlock_index; + goto out_unlock;
- mutex_lock(&loop_ctl_mutex); if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) { /* * In autoclear mode, stop the loop thread @@ -1593,7 +1593,7 @@ static void lo_release(struct gendisk *d */ err = loop_clr_fd(lo); if (!err) - goto unlock_index; + return; } else { /* * Otherwise keep thread (if running) and config, @@ -1602,9 +1602,8 @@ static void lo_release(struct gendisk *d loop_flush(lo); }
+out_unlock: mutex_unlock(&loop_ctl_mutex); -unlock_index: - mutex_unlock(&loop_index_mutex); }
static const struct block_device_operations lo_fops = { @@ -1890,7 +1889,7 @@ static struct kobject *loop_probe(dev_t struct kobject *kobj; int err;
- mutex_lock(&loop_index_mutex); + mutex_lock(&loop_ctl_mutex); err = loop_lookup(&lo, MINOR(dev) >> part_shift); if (err < 0) err = loop_add(&lo, MINOR(dev) >> part_shift); @@ -1898,7 +1897,7 @@ static struct kobject *loop_probe(dev_t kobj = NULL; else kobj = get_disk(lo->lo_disk); - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex);
*part = 0; return kobj; @@ -1908,9 +1907,13 @@ static long loop_control_ioctl(struct fi unsigned long parm) { struct loop_device *lo; - int ret = -ENOSYS; + int ret; + + ret = mutex_lock_killable(&loop_ctl_mutex); + if (ret) + return ret;
- mutex_lock(&loop_index_mutex); + ret = -ENOSYS; switch (cmd) { case LOOP_CTL_ADD: ret = loop_lookup(&lo, parm); @@ -1924,7 +1927,6 @@ static long loop_control_ioctl(struct fi ret = loop_lookup(&lo, parm); if (ret < 0) break; - mutex_lock(&loop_ctl_mutex); if (lo->lo_state != Lo_unbound) { ret = -EBUSY; mutex_unlock(&loop_ctl_mutex); @@ -1936,7 +1938,6 @@ static long loop_control_ioctl(struct fi break; } lo->lo_disk->private_data = NULL; - mutex_unlock(&loop_ctl_mutex); idr_remove(&loop_index_idr, lo->lo_number); loop_remove(lo); break; @@ -1946,7 +1947,7 @@ static long loop_control_ioctl(struct fi break; ret = loop_add(&lo, -1); } - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex);
return ret; } @@ -2029,10 +2030,10 @@ static int __init loop_init(void) THIS_MODULE, loop_probe, NULL, NULL);
/* pre-create number of devices given by config or max_loop */ - mutex_lock(&loop_index_mutex); + mutex_lock(&loop_ctl_mutex); for (i = 0; i < nr; i++) loop_add(&lo, i); - mutex_unlock(&loop_index_mutex); + mutex_unlock(&loop_ctl_mutex);
printk(KERN_INFO "loop: module loaded\n"); return 0;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
commit 628bd85947091830a8c4872adfd5ed1d515a9cf2 upstream.
Commit 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") forgot to remove mutex_unlock(&loop_ctl_mutex) from loop_control_ioctl() when replacing loop_index_mutex with loop_ctl_mutex.
Fixes: 0a42e99b58a20883 ("loop: Get rid of loop_index_mutex") Reported-by: syzbot syzbot+c0138741c2290fc5e63f@syzkaller.appspotmail.com Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/loop.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1929,12 +1929,10 @@ static long loop_control_ioctl(struct fi break; if (lo->lo_state != Lo_unbound) { ret = -EBUSY; - mutex_unlock(&loop_ctl_mutex); break; } if (atomic_read(&lo->lo_refcnt) > 0) { ret = -EBUSY; - mutex_unlock(&loop_ctl_mutex); break; } lo->lo_disk->private_data = NULL;
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ivan Mironov mironov.ivan@gmail.com
commit 66a8d5bfb518f9f12d47e1d2dce1732279f9451e upstream.
Strict requirement of pixclock to be zero breaks support of SDL 1.2 which contains hardcoded table of supported video modes with non-zero pixclock values[1].
To better understand which pixclock values are considered valid and how driver should handle these values, I briefly examined few existing fbdev drivers and documentation in Documentation/fb/. And it looks like there are no strict rules on that and actual behaviour varies:
* some drivers treat (pixclock == 0) as "use defaults" (uvesafb.c); * some treat (pixclock == 0) as invalid value which leads to -EINVAL (clps711x-fb.c); * some pass converted pixclock value to hardware (uvesafb.c); * some are trying to find nearest value from predefined table (vga16fb.c, video_gx.c).
Given this, I believe that it should be safe to just ignore this value if changing is not supported. It seems that any portable fbdev application which was not written only for one specific device working under one specific kernel version should not rely on any particular behaviour of pixclock anyway.
However, while enabling SDL1 applications to work out of the box when there is no /etc/fb.modes with valid settings, this change affects the video mode choosing logic in SDL. Depending on current screen resolution, contents of /etc/fb.modes and resolution requested by application, this may lead to user-visible difference (not always): image will be displayed in a right way, but it will be aligned to the left instead of center. There is no "right behaviour" here as well, as emulated fbdev, opposing to old fbdev drivers, simply ignores any requsts of video mode changes with resolutions smaller than current.
The easiest way to reproduce this problem is to install sdl-sopwith[2], remove /etc/fb.modes file if it exists, and then try to run sopwith from console without X. At least in Fedora 29, sopwith may be simply installed from standard repositories.
[1] SDL 1.2.15 source code, src/video/fbcon/SDL_fbvideo.c, vesa_timings [2] http://sdl-sopwith.sourceforge.net/
Signed-off-by: Ivan Mironov mironov.ivan@gmail.com Cc: stable@vger.kernel.org Fixes: 79e539453b34e ("DRM: i915: add mode setting support") Fixes: 771fe6b912fca ("drm/radeon: introduce kernel modesetting for radeon hardware") Fixes: 785b93ef8c309 ("drm/kms: move driver specific fb common code to helper functions (v2)") Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20190108072353.28078-3-mironov... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/drm_fb_helper.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -1238,9 +1238,14 @@ int drm_fb_helper_check_var(struct fb_va struct drm_framebuffer *fb = fb_helper->fb; int depth;
- if (var->pixclock != 0 || in_dbg_master()) + if (in_dbg_master()) return -EINVAL;
+ if (var->pixclock != 0) { + DRM_DEBUG("fbdev emulation doesn't support changing the pixel clock, value of pixclock is ignored\n"); + var->pixclock = 0; + } + /* Need to resize the fb object !!! */ if (var->bits_per_pixel > fb->bits_per_pixel || var->xres > fb->width || var->yres > fb->height ||
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hocko mhocko@suse.com
commit 63f3655f950186752236bb88a22f8252c11ce394 upstream.
Liu Bo has experienced a deadlock between memcg (legacy) reclaim and the ext4 writeback
task1: wait_on_page_bit+0x82/0xa0 shrink_page_list+0x907/0x960 shrink_inactive_list+0x2c7/0x680 shrink_node_memcg+0x404/0x830 shrink_node+0xd8/0x300 do_try_to_free_pages+0x10d/0x330 try_to_free_mem_cgroup_pages+0xd5/0x1b0 try_charge+0x14d/0x720 memcg_kmem_charge_memcg+0x3c/0xa0 memcg_kmem_charge+0x7e/0xd0 __alloc_pages_nodemask+0x178/0x260 alloc_pages_current+0x95/0x140 pte_alloc_one+0x17/0x40 __pte_alloc+0x1e/0x110 alloc_set_pte+0x5fe/0xc20 do_fault+0x103/0x970 handle_mm_fault+0x61e/0xd10 __do_page_fault+0x252/0x4d0 do_page_fault+0x30/0x80 page_fault+0x28/0x30
task2: __lock_page+0x86/0xa0 mpage_prepare_extent_to_map+0x2e7/0x310 [ext4] ext4_writepages+0x479/0xd60 do_writepages+0x1e/0x30 __writeback_single_inode+0x45/0x320 writeback_sb_inodes+0x272/0x600 __writeback_inodes_wb+0x92/0xc0 wb_writeback+0x268/0x300 wb_workfn+0xb4/0x390 process_one_work+0x189/0x420 worker_thread+0x4e/0x4b0 kthread+0xe6/0x100 ret_from_fork+0x41/0x50
He adds "task1 is waiting for the PageWriteback bit of the page that task2 has collected in mpd->io_submit->io_bio, and tasks2 is waiting for the LOCKED bit the page which tasks1 has locked"
More precisely task1 is handling a page fault and it has a page locked while it charges a new page table to a memcg. That in turn hits a memory limit reclaim and the memcg reclaim for legacy controller is waiting on the writeback but that is never going to finish because the writeback itself is waiting for the page locked in the #PF path. So this is essentially ABBA deadlock:
lock_page(A) SetPageWriteback(A) unlock_page(A) lock_page(B) lock_page(B) pte_alloc_pne shrink_page_list wait_on_page_writeback(A) SetPageWriteback(B) unlock_page(B)
# flush A, B to clear the writeback
This accumulating of more pages to flush is used by several filesystems to generate a more optimal IO patterns.
Waiting for the writeback in legacy memcg controller is a workaround for pre-mature OOM killer invocations because there is no dirty IO throttling available for the controller. There is no easy way around that unfortunately. Therefore fix this specific issue by pre-allocating the page table outside of the page lock. We have that handy infrastructure for that already so simply reuse the fault-around pattern which already does this.
There are probably other hidden __GFP_ACCOUNT | GFP_KERNEL allocations from under a fs page locked but they should be really rare. I am not aware of a better solution unfortunately.
[akpm@linux-foundation.org: fix mm/memory.c:__do_fault()] [akpm@linux-foundation.org: coding-style fixes] [mhocko@kernel.org: enhance comment, per Johannes] Link: http://lkml.kernel.org/r/20181214084948.GA5624@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org Fixes: c3b94f44fcb0 ("memcg: further prevent OOM with too many dirty pages") Signed-off-by: Michal Hocko mhocko@suse.com Reported-by: Liu Bo bo.liu@linux.alibaba.com Debugged-by: Liu Bo bo.liu@linux.alibaba.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Acked-by: Johannes Weiner hannes@cmpxchg.org Reviewed-by: Liu Bo bo.liu@linux.alibaba.com Cc: Jan Kara jack@suse.cz Cc: Dave Chinner david@fromorbit.com Cc: Theodore Ts'o tytso@mit.edu Cc: Vladimir Davydov vdavydov.dev@gmail.com Cc: Shakeel Butt shakeelb@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/memory.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)
--- a/mm/memory.c +++ b/mm/memory.c @@ -2823,6 +2823,28 @@ static int __do_fault(struct fault_env * struct vm_fault vmf; int ret;
+ /* + * Preallocate pte before we take page_lock because this might lead to + * deadlocks for memcg reclaim which waits for pages under writeback: + * lock_page(A) + * SetPageWriteback(A) + * unlock_page(A) + * lock_page(B) + * lock_page(B) + * pte_alloc_pne + * shrink_page_list + * wait_on_page_writeback(A) + * SetPageWriteback(B) + * unlock_page(B) + * # flush A, B to clear the writeback + */ + if (pmd_none(*fe->pmd) && !fe->prealloc_pte) { + fe->prealloc_pte = pte_alloc_one(vma->vm_mm, fe->address); + if (!fe->prealloc_pte) + return VM_FAULT_OOM; + smp_wmb(); /* See comment in __pte_alloc() */ + } + vmf.virtual_address = (void __user *)(fe->address & PAGE_MASK); vmf.pgoff = pgoff; vmf.flags = fe->flags;
On Mon, 21 Jan 2019 at 19:28, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.9.152 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 23 12:24:02 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.152-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.9.152-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.9.y git commit: 5e5ea6306dfd1c05e5a09b0c6859511f9c7a2c5c git describe: v4.9.151-55-g5e5ea6306dfd Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.151-55-...
No regressions (compared to build v4.9.151)
No fixes (compared to build v4.9.151)
Ran 21456 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * boot * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-timers-tests * spectre-meltdown-checker-test * ltp-open-posix-tests * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Mon, Jan 21, 2019 at 02:43:56PM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.152 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 23 12:24:02 UTC 2019. Anything received after that time might be too late.
Build results: total: 172 pass: 172 fail: 0 Qemu test results: total: 315 pass: 315 fail: 0
Guenter
On 1/21/19 6:43 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.152 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 23 12:24:02 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.152-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On 21/01/2019 13:43, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.9.152 release. There are 51 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed Jan 23 12:24:02 UTC 2019. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.152-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.9: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 14 tests: 14 pass, 0 fail
Linux version: 4.9.152-rc1-g5e5ea63 Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
linux-stable-mirror@lists.linaro.org