This is the start of the stable review cycle for the 4.4.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 26 Aug 2020 08:23:34 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.234-rc1...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.234-rc1
Adam Ford aford173@gmail.com omapfb: dss: Fix max fclk divider for omap36xx
Juergen Gross jgross@suse.com xen: don't reschedule in preemption off sections
Peter Xu peterx@redhat.com mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
Al Viro viro@zeniv.linux.org.uk do_epoll_ctl(): clean the failure exits up a bit
Marc Zyngier maz@kernel.org epoll: Keep a reference on files added to the check list
Michael Ellerman mpe@ellerman.id.au powerpc: Allow 4224 bytes of stack expansion for the signal frame
Dinghao Liu dinghao.liu@zju.edu.cn ASoC: intel: Fix memleak in sst_media_open
Eric Sandeen sandeen@redhat.com ext4: fix potential negative array index in do_split()
Luc Van Oostenryck luc.vanoostenryck@gmail.com alpha: fix annotation of io{read,write}{16,32}be()
Eiichi Tsukata devel@etsukata.com xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init
Mao Wenan wenan.mao@linux.alibaba.com virtio_ring: Avoid loop when vq is broken in virtqueue_poll
Javed Hasan jhasan@marvell.com scsi: libfc: Free skb in fc_disc_gpn_id_resp() for valid cases
Zhe Li lizhe67@huawei.com jffs2: fix UAF problem
Darrick J. Wong darrick.wong@oracle.com xfs: fix inode quota reservation checks
Greg Ungerer gerg@linux-m68k.org m68knommu: fix overwriting of bits in ColdFire V3 cache control
Xiongfeng Wang wangxiongfeng2@huawei.com Input: psmouse - add a newline when printing 'proto' by sysfs
Evgeny Novikov novikov@ispras.ru media: vpss: clean up resources in init
Chuhong Yuan hslester96@gmail.com media: budget-core: Improve exception handling in budget_register()
Jan Kara jack@suse.cz ext4: fix checking of directory entry validity for inline directories
Eric Biggers ebiggers@google.com ext4: clean up ext4_match() and callers
Charan Teja Reddy charante@codeaurora.org mm, page_alloc: fix core hung in free_pcppages_bulk()
Doug Berger opendmb@gmail.com mm: include CMA pages in lowmem_reserve at boot
Jann Horn jannh@google.com romfs: fix uninitialized memory leak in romfs_dev_read()
Josef Bacik josef@toxicpanda.com btrfs: don't show full path of bind mounts in subvol=
Marcos Paulo de Souza mpdesouza@suse.com btrfs: export helpers for subvolume name/id resolution
Hugh Dickins hughd@google.com khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
Hugh Dickins hughd@google.com khugepaged: khugepaged_test_exit() check mmget_still_valid()
Andrea Arcangeli aarcange@redhat.com coredump: fix race condition between collapse_huge_page() and core dumping
Ahmad Fatoum a.fatoum@pengutronix.de watchdog: f71808e_wdt: remove use of wrong watchdog_info option
Ahmad Fatoum a.fatoum@pengutronix.de watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options
Kees Cook keescook@chromium.org net/compat: Add missing sock updates for SCM_RIGHTS
Masami Hiramatsu mhiramat@kernel.org perf probe: Fix memory leakage when the probe point is not found
Liu Ying victor.liu@nxp.com drm/imx: imx-ldb: Disable both channels for split mode in enc->disable()
-------------
Diffstat:
 Makefile | 4 +-
 arch/alpha/include/asm/io.h | 8 +--
 arch/m68k/include/asm/m53xxacr.h | 6 +-
 arch/powerpc/mm/fault.c | 7 +-
 drivers/gpu/drm/imx/imx-ldb.c | 7 +-
 drivers/input/mouse/psmouse-base.c | 2 +-
 drivers/media/pci/ttpci/budget-core.c | 11 +++-
 drivers/media/platform/davinci/vpss.c | 20 ++++--
 drivers/scsi/libfc/fc_disc.c | 12 +++-
 drivers/video/fbdev/omap2/dss/dss.c | 2 +-
 drivers/virtio/virtio_ring.c | 3 +
 drivers/watchdog/f71808e_wdt.c | 6 +-
 drivers/xen/preempt.c | 2 +-
 fs/btrfs/ctree.h | 2 +
 fs/btrfs/export.c | 8 +--
 fs/btrfs/export.h | 5 ++
 fs/btrfs/super.c | 18 +++--
 fs/eventpoll.c | 19 +++---
 fs/ext4/namei.c | 99 +++++++++++----------------
 fs/jffs2/dir.c | 6 +-
 fs/romfs/storage.c | 4 +-
 fs/xfs/xfs_sysfs.h | 6 +-
 fs/xfs/xfs_trans_dquot.c | 2 +-
 include/linux/mm.h | 4 ++
 include/net/sock.h | 4 ++
 mm/huge_memory.c | 4 +-
 mm/hugetlb.c | 25 ++++---
 mm/page_alloc.c | 7 +-
 net/compat.c | 1 +
 net/core/sock.c | 21 ++++++
 sound/soc/intel/atom/sst-mfld-platform-pcm.c | 5 +-
 tools/perf/util/probe-finder.c | 2 +-
 32 files changed, 197 insertions(+), 135 deletions(-)
From: Liu Ying victor.liu@nxp.com
[ Upstream commit 3b2a999582c467d1883716b37ffcc00178a13713 ]
Both of the two LVDS channels should be disabled for split mode in the encoder's ->disable() callback, because they are enabled in the encoder's ->enable() callback.
Fixes: 6556f7f82b9c ("drm: imx: Move imx-drm driver out of staging") Cc: Philipp Zabel p.zabel@pengutronix.de Cc: Sascha Hauer s.hauer@pengutronix.de Cc: Pengutronix Kernel Team kernel@pengutronix.de Cc: NXP Linux Team linux-imx@nxp.com Cc: stable@vger.kernel.org Signed-off-by: Liu Ying victor.liu@nxp.com Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/imx/imx-ldb.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c index 31ca56e593f58..b9dc2ef64ed88 100644 --- a/drivers/gpu/drm/imx/imx-ldb.c +++ b/drivers/gpu/drm/imx/imx-ldb.c @@ -305,6 +305,7 @@ static void imx_ldb_encoder_disable(struct drm_encoder *encoder) { struct imx_ldb_channel *imx_ldb_ch = enc_to_imx_ldb_ch(encoder); struct imx_ldb *ldb = imx_ldb_ch->ldb; + int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN; int mux, ret;
/* @@ -321,14 +322,14 @@ static void imx_ldb_encoder_disable(struct drm_encoder *encoder)
drm_panel_disable(imx_ldb_ch->panel);
- if (imx_ldb_ch == &ldb->channel[0]) + if (imx_ldb_ch == &ldb->channel[0] || dual) ldb->ldb_ctrl &= ~LDB_CH0_MODE_EN_MASK; - else if (imx_ldb_ch == &ldb->channel[1]) + if (imx_ldb_ch == &ldb->channel[1] || dual) ldb->ldb_ctrl &= ~LDB_CH1_MODE_EN_MASK;
regmap_write(ldb->regmap, IOMUXC_GPR2, ldb->ldb_ctrl);
- if (ldb->ldb_ctrl & LDB_SPLIT_MODE_EN) { + if (dual) { clk_disable_unprepare(ldb->clk[0]); clk_disable_unprepare(ldb->clk[1]); }
From: Masami Hiramatsu mhiramat@kernel.org
[ Upstream commit 12d572e785b15bc764e956caaa8a4c846fd15694 ]
Fix the memory leakage in debuginfo__find_trace_events() when the probe point is not found in the debuginfo. If there is no probe point found in the debuginfo, debuginfo__find_probes() will NOT return -ENOENT, but 0.
Thus the caller of debuginfo__find_probes() must check the tf.ntevs and release the allocated memory for the array of struct probe_trace_event.
The current code releases the memory only if debuginfo__find_probes() hits an error, but it does not check tf.ntevs. As a result, the memory allocated for *tevs is not released if tf.ntevs == 0.
This fixes the memory leakage by checking tf.ntevs == 0 in addition to ret < 0.
Fixes: ff741783506c ("perf probe: Introduce debuginfo to encapsulate dwarf information") Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Reviewed-by: Srikar Dronamraju srikar@linux.vnet.ibm.com Cc: Andi Kleen ak@linux.intel.com Cc: Oleg Nesterov oleg@redhat.com Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/159438668346.62703.10887420400718492503.stgit@de... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/probe-finder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c index c694f10d004cc..1b73537af91db 100644 --- a/tools/perf/util/probe-finder.c +++ b/tools/perf/util/probe-finder.c @@ -1274,7 +1274,7 @@ int debuginfo__find_trace_events(struct debuginfo *dbg, tf.ntevs = 0;
ret = debuginfo__find_probes(dbg, &tf.pf); - if (ret < 0) { + if (ret < 0 || tf.ntevs == 0) { for (i = 0; i < tf.ntevs; i++) clear_probe_trace_event(&tf.tevs[i]); zfree(tevs);
From: Kees Cook keescook@chromium.org
[ Upstream commit d9539752d23283db4692384a634034f451261e29 ]
Add missed sock updates to compat path via a new helper, which will be used more in coming patches. (The net/core/scm.c code is left as-is here to assist with -stable backports for the compat path.)
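For background only (this is not part of the patch), the compat path touched here handles file descriptors passed over a unix socket with an SCM_RIGHTS control message. A minimal sender-side sketch of that mechanism, with the function name and error handling made up for illustration:

/* Illustrative only, not from this series: pass an fd over a unix
 * socket with SCM_RIGHTS; the receiving side of such a message is what
 * ends up in scm_detach_fds()/scm_detach_fds_compat() in the kernel. */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

static int send_fd(int unix_sock, int fd_to_pass)
{
	char data = 'x';
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

	memset(&u, 0, sizeof(u));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;		/* pass a file descriptor */
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

	return sendmsg(unix_sock, &msg, 0) < 0 ? -1 : 0;
}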
Cc: Christoph Hellwig hch@lst.de Cc: Sargun Dhillon sargun@sargun.me Cc: Jakub Kicinski kuba@kernel.org Cc: stable@vger.kernel.org Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly") Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly") Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sock.h | 4 ++++ net/compat.c | 1 + net/core/sock.c | 21 +++++++++++++++++++++ 3 files changed, 26 insertions(+)
diff --git a/include/net/sock.h b/include/net/sock.h index 426a57874964c..31198b32d9122 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -779,6 +779,8 @@ static inline int sk_memalloc_socks(void) { return static_key_false(&memalloc_socks); } + +void __receive_sock(struct file *file); #else
static inline int sk_memalloc_socks(void) @@ -786,6 +788,8 @@ static inline int sk_memalloc_socks(void) return 0; }
+static inline void __receive_sock(struct file *file) +{ } #endif
static inline gfp_t sk_gfp_atomic(const struct sock *sk, gfp_t gfp_mask) diff --git a/net/compat.c b/net/compat.c index d676840104556..20c5e5f215f23 100644 --- a/net/compat.c +++ b/net/compat.c @@ -284,6 +284,7 @@ void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm) break; } /* Bump the usage count and install the file. */ + __receive_sock(fp[i]); fd_install(new_fd, get_file(fp[i])); }
diff --git a/net/core/sock.c b/net/core/sock.c index 120d5058d81ae..82f9a7dbea6fe 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2275,6 +2275,27 @@ int sock_no_mmap(struct file *file, struct socket *sock, struct vm_area_struct * } EXPORT_SYMBOL(sock_no_mmap);
+/* + * When a file is received (via SCM_RIGHTS, etc), we must bump the + * various sock-based usage counts. + */ +void __receive_sock(struct file *file) +{ + struct socket *sock; + int error; + + /* + * The resulting value of "error" is ignored here since we only + * need to take action when the file is a socket and testing + * "sock" for NULL is sufficient. + */ + sock = sock_from_file(file, &error); + if (sock) { + sock_update_netprioidx(sock->sk); + sock_update_classid(sock->sk); + } +} + ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) { ssize_t res;
From: Ahmad Fatoum a.fatoum@pengutronix.de
[ Upstream commit e871e93fb08a619dfc015974a05768ed6880fd82 ]
The driver supports populating bootstatus with WDIOF_CARDRESET, but so far userspace couldn't portably determine whether absence of this flag meant no watchdog reset or no driver support. Or-in the bit to fix this.
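For illustration only (not part of the patch), userspace can now use the usual portable pattern below; the device path and error handling are just a sketch, and only the standard <linux/watchdog.h> ioctls are assumed:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

int main(void)
{
	struct watchdog_info info;
	int bootstatus = 0;
	int fd = open("/dev/watchdog", O_WRONLY);

	if (fd < 0)
		return 1;

	/* Only trust the bootstatus flag if the driver reports support for it. */
	if (ioctl(fd, WDIOC_GETSUPPORT, &info) == 0 &&
	    (info.options & WDIOF_CARDRESET) &&
	    ioctl(fd, WDIOC_GETBOOTSTATUS, &bootstatus) == 0)
		printf("last reset caused by watchdog: %s\n",
		       (bootstatus & WDIOF_CARDRESET) ? "yes" : "no");

	/* Best-effort magic close so this sketch does not leave the timer armed. */
	if (write(fd, "V", 1) < 0)
		perror("write");
	close(fd);
	return 0;
}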
Fixes: b97cb21a4634 ("watchdog: f71808e_wdt: Fix WDTMOUT_STS register read") Cc: stable@vger.kernel.org Signed-off-by: Ahmad Fatoum a.fatoum@pengutronix.de Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20200611191750.28096-3-a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/f71808e_wdt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c index 2048aad91add8..3577e356e08cc 100644 --- a/drivers/watchdog/f71808e_wdt.c +++ b/drivers/watchdog/f71808e_wdt.c @@ -644,7 +644,8 @@ static int __init watchdog_init(int sioaddr) watchdog.sioaddr = sioaddr; watchdog.ident.options = WDIOC_SETTIMEOUT | WDIOF_MAGICCLOSE - | WDIOF_KEEPALIVEPING; + | WDIOF_KEEPALIVEPING + | WDIOF_CARDRESET;
snprintf(watchdog.ident.identity, sizeof(watchdog.ident.identity), "%s watchdog",
From: Ahmad Fatoum a.fatoum@pengutronix.de
[ Upstream commit 802141462d844f2e6a4d63a12260d79b7afc4c34 ]
The flags that should be or-ed into the watchdog_info.options by drivers all start with WDIOF_, e.g. WDIOF_SETTIMEOUT, which indicates that the driver's watchdog_ops has a usable set_timeout.
WDIOC_SETTIMEOUT was used instead, which expands to 0xc0045706, which equals:
WDIOF_FANFAULT | WDIOF_EXTERN1 | WDIOF_PRETIMEOUT | WDIOF_ALARMONLY | WDIOF_MAGICCLOSE | 0xc0045000
These were so far indicated to userspace on WDIOC_GETSUPPORT. As the driver has not yet been migrated to the new watchdog kernel API, the constant can just be dropped without substitute.
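To make the stray bits visible, a small userspace demonstration (not part of the patch; only the standard <linux/watchdog.h> definitions are assumed) can print which WDIOF_* capability bits happen to be contained in the WDIOC_SETTIMEOUT ioctl number:

#include <stdio.h>
#include <linux/watchdog.h>

int main(void)
{
	unsigned int wrong = WDIOC_SETTIMEOUT;	/* an ioctl number, 0xc0045706 */

	printf("WDIOC_SETTIMEOUT      = %#x\n", wrong);
	printf("sets WDIOF_FANFAULT   : %d\n", !!(wrong & WDIOF_FANFAULT));
	printf("sets WDIOF_EXTERN1    : %d\n", !!(wrong & WDIOF_EXTERN1));
	printf("sets WDIOF_MAGICCLOSE : %d\n", !!(wrong & WDIOF_MAGICCLOSE));
	printf("sets WDIOF_PRETIMEOUT : %d\n", !!(wrong & WDIOF_PRETIMEOUT));
	printf("sets WDIOF_ALARMONLY  : %d\n", !!(wrong & WDIOF_ALARMONLY));
	return 0;
}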
Fixes: 96cb4eb019ce ("watchdog: f71808e_wdt: new watchdog driver for Fintek F71808E and F71882FG") Cc: stable@vger.kernel.org Signed-off-by: Ahmad Fatoum a.fatoum@pengutronix.de Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/20200611191750.28096-4-a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/watchdog/f71808e_wdt.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/watchdog/f71808e_wdt.c b/drivers/watchdog/f71808e_wdt.c index 3577e356e08cc..2b12ef019ae02 100644 --- a/drivers/watchdog/f71808e_wdt.c +++ b/drivers/watchdog/f71808e_wdt.c @@ -642,8 +642,7 @@ static int __init watchdog_init(int sioaddr) * into the module have been registered yet. */ watchdog.sioaddr = sioaddr; - watchdog.ident.options = WDIOC_SETTIMEOUT - | WDIOF_MAGICCLOSE + watchdog.ident.options = WDIOF_MAGICCLOSE | WDIOF_KEEPALIVEPING | WDIOF_CARDRESET;
From: Andrea Arcangeli aarcange@redhat.com
[ Upstream commit 59ea6d06cfa9247b586a695c21f94afa7183af74 ]
When fixing the race conditions between the coredump and the mmap_sem holders outside the context of the process, we focused on mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping"), but those aren't the only cases where the mmap_sem can be taken outside of the context of the process as Michal Hocko noticed while backporting that commit to older -stable kernels.
If mmgrab() is called in the context of the process, but then the mm_count reference is transferred outside the context of the process, that can also be a problem if the mmap_sem has to be taken for writing through that mm_count reference.
khugepaged registration calls mmgrab() in the context of the process, but the mmap_sem for writing is taken later in the context of the khugepaged kernel thread.
collapse_huge_page() after taking the mmap_sem for writing doesn't modify any vma, so it's not obvious that it could cause a problem to the coredump, but it happens to modify the pmd in a way that breaks an invariant that pmd_trans_huge_lock() relies upon. collapse_huge_page() needs the mmap_sem for writing just to block concurrent page faults that call pmd_trans_huge_lock().
Specifically the invariant that "!pmd_trans_huge()" cannot become a "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
The coredump will call __get_user_pages() without mmap_sem for reading, which eventually can invoke a lockless page fault which will need a functional pmd_trans_huge_lock().
So collapse_huge_page() needs to use mmget_still_valid() to check it's not running concurrently with the coredump... as long as the coredump can invoke page faults without holding the mmap_sem for reading.
This has "Fixes: khugepaged" to facilitate backporting, but in my view it's more a bug in the coredump code that will eventually have to be rewritten to stop invoking page faults without the mmap_sem for reading. So the long term plan is still to drop all mmget_still_valid().
Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com Fixes: ba76149f47d8 ("thp: khugepaged") Signed-off-by: Andrea Arcangeli aarcange@redhat.com Reported-by: Michal Hocko mhocko@suse.com Acked-by: Michal Hocko mhocko@suse.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Oleg Nesterov oleg@redhat.com Cc: Jann Horn jannh@google.com Cc: Hugh Dickins hughd@google.com Cc: Mike Rapoport rppt@linux.vnet.ibm.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Peter Xu peterx@redhat.com Cc: Jason Gunthorpe jgg@mellanox.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mm.h | 4 ++++ mm/huge_memory.c | 3 +++ 2 files changed, 7 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h index 03cf5526e4456..2b17d2fca4299 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1123,6 +1123,10 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma, * followed by taking the mmap_sem for writing before modifying the * vmas or anything the coredump pretends not to change from under it. * + * It also has to be called when mmgrab() is used in the context of + * the process, but then the mm_count refcount is transferred outside + * the context of the process to run down_write() on that pinned mm. + * * NOTE: find_extend_vma() called from GUP context is the only place * that can modify the "mm" (notably the vm_start/end) under mmap_sem * for reading and outside the context of the process, so it is also diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 465786cd6490e..c5628ebc0fc29 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2587,6 +2587,9 @@ static void collapse_huge_page(struct mm_struct *mm, * handled by the anon_vma lock + PG_lock. */ down_write(&mm->mmap_sem); + result = SCAN_ANY_PROCESS; + if (!mmget_still_valid(mm)) + goto out; if (unlikely(khugepaged_test_exit(mm))) goto out;
From: Hugh Dickins hughd@google.com
[ Upstream commit bbe98f9cadff58cdd6a4acaeba0efa8565dabe65 ]
Move collapse_huge_page()'s mmget_still_valid() check into khugepaged_test_exit() itself. collapse_huge_page() is used for anon THP only, and earned its mmget_still_valid() check because it inserts a huge pmd entry in place of the page table's pmd entry; whereas collapse_file()'s retract_page_tables() or collapse_pte_mapped_thp() merely clears the page table's pmd entry. But core dumping without mmap lock must have been as open to mistaking a racily cleared pmd entry for a page table at physical page 0, as exit_mmap() was. And we certainly have no interest in mapping as a THP once dumping core.
Fixes: 59ea6d06cfa9 ("coredump: fix race condition between collapse_huge_page() and core dumping") Signed-off-by: Hugh Dickins hughd@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Cc: Andrea Arcangeli aarcange@redhat.com Cc: Song Liu songliubraving@fb.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: stable@vger.kernel.org [4.8+] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021217020.27773@eggly.anvils Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/huge_memory.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c5628ebc0fc29..1c4d7d2f53d22 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2136,7 +2136,7 @@ static void insert_to_mm_slots_hash(struct mm_struct *mm,
static inline int khugepaged_test_exit(struct mm_struct *mm) { - return atomic_read(&mm->mm_users) == 0; + return atomic_read(&mm->mm_users) == 0 || !mmget_still_valid(mm); }
int __khugepaged_enter(struct mm_struct *mm) @@ -2587,9 +2587,6 @@ static void collapse_huge_page(struct mm_struct *mm, * handled by the anon_vma lock + PG_lock. */ down_write(&mm->mmap_sem); - result = SCAN_ANY_PROCESS; - if (!mmget_still_valid(mm)) - goto out; if (unlikely(khugepaged_test_exit(mm))) goto out;
From: Hugh Dickins hughd@google.com
[ Upstream commit f3f99d63a8156c7a4a6b20aac22b53c5579c7dc1 ]
syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in __khugepaged_enter(): yes, when one thread is about to dump core, has set core_state, and is waiting for others, another might do something calling __khugepaged_enter(), which now crashes because I lumped the core_state test (known as "mmget_still_valid") into khugepaged_test_exit(). I still think it's best to lump them together, so just in this exceptional case, check mm->mm_users directly instead of khugepaged_test_exit().
Fixes: bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Hugh Dickins hughd@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Yang Shi shy828301@gmail.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Song Liu songliubraving@fb.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Eric Dumazet edumazet@google.com Cc: stable@vger.kernel.org [4.8+] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/huge_memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 1c4d7d2f53d22..f38d24bb8a1bc 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2149,7 +2149,7 @@ int __khugepaged_enter(struct mm_struct *mm) return -ENOMEM;
/* __khugepaged_exit() must not run from under us */ - VM_BUG_ON_MM(khugepaged_test_exit(mm), mm); + VM_BUG_ON_MM(atomic_read(&mm->mm_users) == 0, mm); if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) { free_mm_slot(mm_slot); return 0;
From: Marcos Paulo de Souza mpdesouza@suse.com
[ Upstream commit c0c907a47dccf2cf26251a8fb4a8e7a3bf79ce84 ]
The functions will be used outside of export.c and super.c to allow resolving subvolume name from a given id, eg. for subvolume deletion by id ioctl.
Signed-off-by: Marcos Paulo de Souza mpdesouza@suse.com Reviewed-by: David Sterba dsterba@suse.com [ split from the next patch ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.h | 2 ++ fs/btrfs/export.c | 8 ++++---- fs/btrfs/export.h | 5 +++++ fs/btrfs/super.c | 8 ++++---- 4 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 0b06d4942da77..8fb9a1e0048be 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -4096,6 +4096,8 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size); /* super.c */ int btrfs_parse_options(struct btrfs_root *root, char *options); int btrfs_sync_fs(struct super_block *sb, int wait); +char *btrfs_get_subvol_name_from_objectid(struct btrfs_fs_info *fs_info, + u64 subvol_objectid);
#ifdef CONFIG_PRINTK __printf(2, 3) diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c index 2513a7f533342..92f80ed642194 100644 --- a/fs/btrfs/export.c +++ b/fs/btrfs/export.c @@ -55,9 +55,9 @@ static int btrfs_encode_fh(struct inode *inode, u32 *fh, int *max_len, return type; }
-static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid, - u64 root_objectid, u32 generation, - int check_generation) +struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid, + u64 root_objectid, u32 generation, + int check_generation) { struct btrfs_fs_info *fs_info = btrfs_sb(sb); struct btrfs_root *root; @@ -150,7 +150,7 @@ static struct dentry *btrfs_fh_to_dentry(struct super_block *sb, struct fid *fh, return btrfs_get_dentry(sb, objectid, root_objectid, generation, 1); }
-static struct dentry *btrfs_get_parent(struct dentry *child) +struct dentry *btrfs_get_parent(struct dentry *child) { struct inode *dir = d_inode(child); struct btrfs_root *root = BTRFS_I(dir)->root; diff --git a/fs/btrfs/export.h b/fs/btrfs/export.h index 074348a95841f..7a305e5549991 100644 --- a/fs/btrfs/export.h +++ b/fs/btrfs/export.h @@ -16,4 +16,9 @@ struct btrfs_fid { u64 parent_root_objectid; } __attribute__ ((packed));
+struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid, + u64 root_objectid, u32 generation, + int check_generation); +struct dentry *btrfs_get_parent(struct dentry *child); + #endif diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 404051bf5cba9..540e6f141745a 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -843,8 +843,8 @@ out: return error; }
-static char *get_subvol_name_from_objectid(struct btrfs_fs_info *fs_info, - u64 subvol_objectid) +char *btrfs_get_subvol_name_from_objectid(struct btrfs_fs_info *fs_info, + u64 subvol_objectid) { struct btrfs_root *root = fs_info->tree_root; struct btrfs_root *fs_root; @@ -1323,8 +1323,8 @@ static struct dentry *mount_subvol(const char *subvol_name, u64 subvol_objectid, goto out; } } - subvol_name = get_subvol_name_from_objectid(btrfs_sb(mnt->mnt_sb), - subvol_objectid); + subvol_name = btrfs_get_subvol_name_from_objectid( + btrfs_sb(mnt->mnt_sb), subvol_objectid); if (IS_ERR(subvol_name)) { root = ERR_CAST(subvol_name); subvol_name = NULL;
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 3ef3959b29c4a5bd65526ab310a1a18ae533172a ]
Chris Murphy reported a problem where rpm ostree will bind mount a bunch of things for whatever voodoo it's doing. But when it does this /proc/mounts shows something like
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo/bar 0 0
Despite subvolid=256 being subvol=/foo. This is because we're just spitting out the dentry of the mount point, which in the case of bind mounts is the source path for the mountpoint. Instead we should spit out the path to the actual subvol. Fix this by looking up the name for the subvolid we have mounted. With this fix the same test looks like this
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
Reported-by: Chris Murphy chris@colorremedies.com CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/super.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 540e6f141745a..77e6ce0e1e351 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -1120,6 +1120,7 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry) struct btrfs_fs_info *info = btrfs_sb(dentry->d_sb); struct btrfs_root *root = info->tree_root; char *compress_type; + const char *subvol_name;
if (btrfs_test_opt(root, DEGRADED)) seq_puts(seq, ",degraded"); @@ -1204,8 +1205,13 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry) #endif seq_printf(seq, ",subvolid=%llu", BTRFS_I(d_inode(dentry))->root->root_key.objectid); - seq_puts(seq, ",subvol="); - seq_dentry(seq, dentry, " \t\n\\"); + subvol_name = btrfs_get_subvol_name_from_objectid(info, + BTRFS_I(d_inode(dentry))->root->root_key.objectid); + if (!IS_ERR(subvol_name)) { + seq_puts(seq, ",subvol="); + seq_escape(seq, subvol_name, " \t\n\\"); + kfree(subvol_name); + } return 0; }
From: Jann Horn jannh@google.com
commit bcf85fcedfdd17911982a3e3564fcfec7b01eebd upstream.
romfs has a superblock field that limits the size of the filesystem; data beyond that limit is never accessed.
romfs_dev_read() fetches a caller-supplied number of bytes from the backing device. It returns 0 on success or an error code on failure; therefore, its API can't represent short reads, it's all-or-nothing.
However, when romfs_dev_read() detects that the requested operation would cross the filesystem size limit, it currently silently truncates the requested number of bytes. This e.g. means that when the content of a file with size 0x1000 starts one byte before the filesystem size limit, ->readpage() will only fill a single byte of the supplied page while leaving the rest uninitialized, leaking that uninitialized memory to userspace.
Fix it by returning an error code instead of truncating the read when the requested read operation would go beyond the end of the filesystem.
Fixes: da4458bda237 ("NOMMU: Make it possible for RomFS to use MTD devices directly") Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: David Howells dhowells@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20200818013202.2246365-1-jannh@google.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/romfs/storage.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/romfs/storage.c +++ b/fs/romfs/storage.c @@ -221,10 +221,8 @@ int romfs_dev_read(struct super_block *s size_t limit;
limit = romfs_maxsize(sb); - if (pos >= limit) + if (pos >= limit || buflen > limit - pos) return -EIO; - if (buflen > limit - pos) - buflen = limit - pos;
#ifdef CONFIG_ROMFS_ON_MTD if (sb->s_mtd)
From: Doug Berger opendmb@gmail.com
commit e08d3fdfe2dafa0331843f70ce1ff6c1c4900bf4 upstream.
The lowmem_reserve arrays provide a means of applying pressure against allocations from lower zones that were targeted at higher zones. Its values are a function of the number of pages managed by higher zones and are assigned by a call to the setup_per_zone_lowmem_reserve() function.
The function is initially called at boot time by the function init_per_zone_wmark_min() and may be called later by accesses of the /proc/sys/vm/lowmem_reserve_ratio sysctl file.
The function init_per_zone_wmark_min() was moved up from a module_init to a core_initcall to resolve a sequencing issue with khugepaged. Unfortunately this created a sequencing issue with CMA page accounting.
The CMA pages are added to the managed page count of a zone when cma_init_reserved_areas() is called at boot also as a core_initcall. This makes it uncertain whether the CMA pages will be added to the managed page counts of their zones before or after the call to init_per_zone_wmark_min() as it becomes dependent on link order. With the current link order the pages are added to the managed count after the lowmem_reserve arrays are initialized at boot.
This means the lowmem_reserve values at boot may be lower than the values used later if /proc/sys/vm/lowmem_reserve_ratio is accessed even if the ratio values are unchanged.
In many cases the difference is not significant, but for example an ARM platform with 1GB of memory and the following memory layout
cma: Reserved 256 MiB at 0x0000000030000000
Zone ranges:
  DMA      [mem 0x0000000000000000-0x000000002fffffff]
  Normal   empty
  HighMem  [mem 0x0000000030000000-0x000000003fffffff]
would result in 0 lowmem_reserve for the DMA zone. This would allow userspace to deplete the DMA zone easily.
Funnily enough
$ cat /proc/sys/vm/lowmem_reserve_ratio
would fix up the situation because as a side effect it forces setup_per_zone_lowmem_reserve.
This commit breaks the link order dependency by invoking init_per_zone_wmark_min() as a postcore_initcall so that the CMA pages have the chance to be properly accounted in their zone(s) and allowing the lowmem_reserve arrays to receive consistent values.
Fixes: bc22af74f271 ("mm: update min_free_kbytes from khugepaged after core initialization") Signed-off-by: Doug Berger opendmb@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: Michal Hocko mhocko@suse.com Cc: Jason Baron jbaron@akamai.com Cc: David Rientjes rientjes@google.com Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1597423766-27849-1-git-send-email-opendmb@gmail.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6285,7 +6285,7 @@ int __meminit init_per_zone_wmark_min(vo setup_per_zone_inactive_ratio(); return 0; } -core_initcall(init_per_zone_wmark_min) +postcore_initcall(init_per_zone_wmark_min)
/* * min_free_kbytes_sysctl_handler - just a wrapper around proc_dointvec() so
From: Charan Teja Reddy charante@codeaurora.org
commit 88e8ac11d2ea3acc003cf01bb5a38c8aa76c3cfd upstream.
The following race is observed with repeated online and offline of memory blocks in the movable zone, with a delay between two successive onlines.
Between the two processes P1 and P2:

P1: Online the first memory block in the movable zone. The pcp struct
    values are initialized to default values, i.e. pcp->high = 0 and
    pcp->batch = 1.

P2: Allocate the pages from the movable zone.

P1: Try to online the second memory block in the movable zone; it has
    entered online_pages() but has not yet called zone_pcp_update().

P2: This process enters the exit path and tries to release the order-0
    pages to the pcp lists through free_unref_page_commit(). As
    pcp->high = 0 and pcp->count = 1, it proceeds to call
    free_pcppages_bulk().

P1: Update the pcp values, so the new pcp values are, say,
    pcp->high = 378 and pcp->batch = 63.

P2: Read the pcp's batch value using READ_ONCE() and pass it to
    free_pcppages_bulk(); the pcp values passed here are batch = 63,
    count = 1.

P2: Since the number of pages in the pcp lists is less than ->batch,
    it gets stuck in the while (list_empty(list)) loop with interrupts
    disabled, and the core hangs.
Avoid this by ensuring free_pcppages_bulk() is called with a proper count of pcp list pages.
The mentioned race is somewhat easily reproducible without [1] because pcp's are not updated for the first memory block online, and thus there is enough of a race window for P2 between alloc+free and the pcp struct values update through onlining of the second memory block.
With [1], the race still exists but it is very narrow as we update the pcp struct values for the first memory block online itself.
This is not limited to the movable zone; it could also happen in cases with the normal zone (e.g., hotplug to a node that only has DMA memory, or no other memory yet).
[1]: https://patchwork.kernel.org/patch/11696389/
Fixes: 5f8dcc21211a ("page-allocator: split per-cpu list into one-list-per-migrate-type") Signed-off-by: Charan Teja Reddy charante@codeaurora.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Acked-by: David Hildenbrand david@redhat.com Acked-by: David Rientjes rientjes@google.com Acked-by: Michal Hocko mhocko@suse.com Cc: Michal Hocko mhocko@suse.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Vinayak Menon vinmenon@codeaurora.org Cc: stable@vger.kernel.org [2.6+] Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeauro... Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- mm/page_alloc.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -843,6 +843,11 @@ static void free_pcppages_bulk(struct zo if (nr_scanned) __mod_zone_page_state(zone, NR_PAGES_SCANNED, -nr_scanned);
+ /* + * Ensure proper count is passed which otherwise would stuck in the + * below while (list_empty(list)) loop. + */ + count = min(pcp->count, count); while (to_free) { struct page *page; struct list_head *list;
From: Eric Biggers ebiggers@google.com
[ Upstream commit d9b9f8d5a88cb7881d9f1c2b7e9de9a3fe1dc9e2 ]
When ext4 encryption was originally merged, we were encrypting the user-specified filename in ext4_match(), introducing a lot of additional complexity into ext4_match() and its callers. This has since been changed to encrypt the filename earlier, so we can remove the gunk that's no longer needed. This more or less reverts ext4_search_dir() and ext4_find_dest_de() to the way they were in the v4.0 kernel.
Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/namei.c | 81 +++++++++++++++---------------------------------- 1 file changed, 25 insertions(+), 56 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index 566a8b08ccdd6..eb41fb189d4b3 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1226,19 +1226,18 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block) }
/* - * NOTE! unlike strncmp, ext4_match returns 1 for success, 0 for failure. + * Test whether a directory entry matches the filename being searched for. * - * `len <= EXT4_NAME_LEN' is guaranteed by caller. - * `de != NULL' is guaranteed by caller. + * Return: %true if the directory entry matches, otherwise %false. */ -static inline int ext4_match(struct ext4_filename *fname, - struct ext4_dir_entry_2 *de) +static inline bool ext4_match(const struct ext4_filename *fname, + const struct ext4_dir_entry_2 *de) { const void *name = fname_name(fname); u32 len = fname_len(fname);
if (!de->inode) - return 0; + return false;
#ifdef CONFIG_EXT4_FS_ENCRYPTION if (unlikely(!name)) { @@ -1270,48 +1269,31 @@ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size, struct ext4_dir_entry_2 * de; char * dlimit; int de_len; - int res;
de = (struct ext4_dir_entry_2 *)search_buf; dlimit = search_buf + buf_size; while ((char *) de < dlimit) { /* this code is executed quadratically often */ /* do minimal checking `by hand' */ - if ((char *) de + de->name_len <= dlimit) { - res = ext4_match(fname, de); - if (res < 0) { - res = -1; - goto return_result; - } - if (res > 0) { - /* found a match - just to be sure, do - * a full check */ - if (ext4_check_dir_entry(dir, NULL, de, bh, - bh->b_data, - bh->b_size, offset)) { - res = -1; - goto return_result; - } - *res_dir = de; - res = 1; - goto return_result; - } - + if ((char *) de + de->name_len <= dlimit && + ext4_match(fname, de)) { + /* found a match - just to be sure, do + * a full check */ + if (ext4_check_dir_entry(dir, NULL, de, bh, bh->b_data, + bh->b_size, offset)) + return -1; + *res_dir = de; + return 1; } /* prevent looping on a bad block */ de_len = ext4_rec_len_from_disk(de->rec_len, dir->i_sb->s_blocksize); - if (de_len <= 0) { - res = -1; - goto return_result; - } + if (de_len <= 0) + return -1; offset += de_len; de = (struct ext4_dir_entry_2 *) ((char *) de + de_len); } - - res = 0; -return_result: - return res; + return 0; }
static int is_dx_internal_node(struct inode *dir, ext4_lblk_t block, @@ -1824,24 +1806,15 @@ int ext4_find_dest_de(struct inode *dir, struct inode *inode, int nlen, rlen; unsigned int offset = 0; char *top; - int res;
de = (struct ext4_dir_entry_2 *)buf; top = buf + buf_size - reclen; while ((char *) de <= top) { if (ext4_check_dir_entry(dir, NULL, de, bh, - buf, buf_size, offset)) { - res = -EFSCORRUPTED; - goto return_result; - } - /* Provide crypto context and crypto buffer to ext4 match */ - res = ext4_match(fname, de); - if (res < 0) - goto return_result; - if (res > 0) { - res = -EEXIST; - goto return_result; - } + buf, buf_size, offset)) + return -EFSCORRUPTED; + if (ext4_match(fname, de)) + return -EEXIST; nlen = EXT4_DIR_REC_LEN(de->name_len); rlen = ext4_rec_len_from_disk(de->rec_len, buf_size); if ((de->inode ? rlen - nlen : rlen) >= reclen) @@ -1849,15 +1822,11 @@ int ext4_find_dest_de(struct inode *dir, struct inode *inode, de = (struct ext4_dir_entry_2 *)((char *)de + rlen); offset += rlen; } - if ((char *) de > top) - res = -ENOSPC; - else { - *dest_de = de; - res = 0; - } -return_result: - return res; + return -ENOSPC; + + *dest_de = de; + return 0; }
int ext4_insert_dentry(struct inode *dir,
From: Jan Kara jack@suse.cz
[ Upstream commit 7303cb5bfe845f7d43cd9b2dbd37dbb266efda9b ]
ext4_search_dir() and ext4_generic_delete_entry() can be called both for standard directory blocks and for inline directories stored inside the inode or inline xattr space. For the second case we didn't call ext4_check_dir_entry() with proper constraints, which could result in accepting a corrupted directory entry as well as false-positive filesystem errors like:
EXT4-fs error (device dm-0): ext4_search_dir:1395: inode #28320400: block 113246792: comm dockerd: bad entry in directory: directory entry too close to block end - offset=0, inode=28320403, rec_len=32, name_len=8, size=4096
Fix the arguments passed to ext4_check_dir_entry().
Fixes: 109ba779d6cc ("ext4: check for directory entries too close to block end") CC: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20200731162135.8080-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/namei.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index eb41fb189d4b3..faf142a6fa8bb 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1279,8 +1279,8 @@ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size, ext4_match(fname, de)) { /* found a match - just to be sure, do * a full check */ - if (ext4_check_dir_entry(dir, NULL, de, bh, bh->b_data, - bh->b_size, offset)) + if (ext4_check_dir_entry(dir, NULL, de, bh, search_buf, + buf_size, offset)) return -1; *res_dir = de; return 1; @@ -2312,7 +2312,7 @@ int ext4_generic_delete_entry(handle_t *handle, de = (struct ext4_dir_entry_2 *)entry_buf; while (i < buf_size - csum_size) { if (ext4_check_dir_entry(dir, NULL, de, bh, - bh->b_data, bh->b_size, i)) + entry_buf, buf_size, i)) return -EFSCORRUPTED; if (de == de_del) { if (pde)
From: Chuhong Yuan hslester96@gmail.com
[ Upstream commit fc0456458df8b3421dba2a5508cd817fbc20ea71 ]
budget_register() has no error handling when one of its steps fails. Add the missing undo functions to the error paths to fix it.
Signed-off-by: Chuhong Yuan hslester96@gmail.com Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/ttpci/budget-core.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/media/pci/ttpci/budget-core.c b/drivers/media/pci/ttpci/budget-core.c index e9674b40007c1..6107c469efa07 100644 --- a/drivers/media/pci/ttpci/budget-core.c +++ b/drivers/media/pci/ttpci/budget-core.c @@ -386,20 +386,25 @@ static int budget_register(struct budget *budget) ret = dvbdemux->dmx.add_frontend(&dvbdemux->dmx, &budget->hw_frontend);
if (ret < 0) - return ret; + goto err_release_dmx;
budget->mem_frontend.source = DMX_MEMORY_FE; ret = dvbdemux->dmx.add_frontend(&dvbdemux->dmx, &budget->mem_frontend); if (ret < 0) - return ret; + goto err_release_dmx;
ret = dvbdemux->dmx.connect_frontend(&dvbdemux->dmx, &budget->hw_frontend); if (ret < 0) - return ret; + goto err_release_dmx;
dvb_net_init(&budget->dvb_adapter, &budget->dvb_net, &dvbdemux->dmx);
return 0; + +err_release_dmx: + dvb_dmxdev_release(&budget->dmxdev); + dvb_dmx_release(&budget->demux); + return ret; }
static void budget_unregister(struct budget *budget)
From: Evgeny Novikov novikov@ispras.ru
[ Upstream commit 9c487b0b0ea7ff22127fe99a7f67657d8730ff94 ]
If platform_driver_register() fails within vpss_init(), resources are not cleaned up. The patch fixes this issue by introducing the corresponding error handling.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov novikov@ispras.ru Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/davinci/vpss.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/media/platform/davinci/vpss.c b/drivers/media/platform/davinci/vpss.c index c2c68988e38ac..9884b34d6f406 100644 --- a/drivers/media/platform/davinci/vpss.c +++ b/drivers/media/platform/davinci/vpss.c @@ -519,19 +519,31 @@ static void vpss_exit(void)
static int __init vpss_init(void) { + int ret; + if (!request_mem_region(VPSS_CLK_CTRL, 4, "vpss_clock_control")) return -EBUSY;
oper_cfg.vpss_regs_base2 = ioremap(VPSS_CLK_CTRL, 4); if (unlikely(!oper_cfg.vpss_regs_base2)) { - release_mem_region(VPSS_CLK_CTRL, 4); - return -ENOMEM; + ret = -ENOMEM; + goto err_ioremap; }
writel(VPSS_CLK_CTRL_VENCCLKEN | - VPSS_CLK_CTRL_DACCLKEN, oper_cfg.vpss_regs_base2); + VPSS_CLK_CTRL_DACCLKEN, oper_cfg.vpss_regs_base2); + + ret = platform_driver_register(&vpss_driver); + if (ret) + goto err_pd_register; + + return 0;
- return platform_driver_register(&vpss_driver); +err_pd_register: + iounmap(oper_cfg.vpss_regs_base2); +err_ioremap: + release_mem_region(VPSS_CLK_CTRL, 4); + return ret; } subsys_initcall(vpss_init); module_exit(vpss_exit);
From: Xiongfeng Wang wangxiongfeng2@huawei.com
[ Upstream commit 4aec14de3a15cf9789a0e19c847f164776f49473 ]
When I cat the parameter 'proto' via sysfs, it displays as follows. It's better to add a newline for easy reading.
root@syzkaller:~# cat /sys/module/psmouse/parameters/proto
autoroot@syzkaller:~#
Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Link: https://lore.kernel.org/r/20200720073846.120724-1-wangxiongfeng2@huawei.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/mouse/psmouse-base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c index ad18dab0ac476..5bd9633541b07 100644 --- a/drivers/input/mouse/psmouse-base.c +++ b/drivers/input/mouse/psmouse-base.c @@ -1911,7 +1911,7 @@ static int psmouse_get_maxproto(char *buffer, const struct kernel_param *kp) { int type = *((unsigned int *)kp->arg);
- return sprintf(buffer, "%s", psmouse_protocol_by_type(type)->name); + return sprintf(buffer, "%s\n", psmouse_protocol_by_type(type)->name); }
static int __init psmouse_init(void)
From: Greg Ungerer gerg@linux-m68k.org
[ Upstream commit bdee0e793cea10c516ff48bf3ebb4ef1820a116b ]
The Cache Control Register (CACR) of the ColdFire V3 has bits that control high level caching functions, and also enable/disable the use of the alternate stack pointer register (the EUSP bit) to provide separate supervisor and user stack pointer registers. The code as it is today will blindly clear the EUSP bit on cache actions like invalidation. So it is broken for this case - and that will result in failed booting (interrupt entry and exit processing will be completely hosed).
This only affects ColdFire V3 parts that support the alternate stack register (like the 5329 for example) - generally speaking new parts do, older parts don't. It has no impact on ColdFire V3 parts with the single stack pointer, like the 5307 for example.
Fix the cache bit defines used, so they maintain the EUSP bit when carrying out cache actions through the CACR register.
Signed-off-by: Greg Ungerer gerg@linux-m68k.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/m68k/include/asm/m53xxacr.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/m68k/include/asm/m53xxacr.h b/arch/m68k/include/asm/m53xxacr.h index 3177ce8331d69..baee0c77b9818 100644 --- a/arch/m68k/include/asm/m53xxacr.h +++ b/arch/m68k/include/asm/m53xxacr.h @@ -88,9 +88,9 @@ * coherency though in all cases. And for copyback caches we will need * to push cached data as well. */ -#define CACHE_INIT CACR_CINVA -#define CACHE_INVALIDATE CACR_CINVA -#define CACHE_INVALIDATED CACR_CINVA +#define CACHE_INIT (CACHE_MODE + CACR_CINVA - CACR_EC) +#define CACHE_INVALIDATE (CACHE_MODE + CACR_CINVA) +#define CACHE_INVALIDATED (CACHE_MODE + CACR_CINVA)
#define ACR0_MODE ((CONFIG_RAMBASE & 0xff000000) + \ (0x000f0000) + \
From: Darrick J. Wong darrick.wong@oracle.com
[ Upstream commit f959b5d037e71a4d69b5bf71faffa065d9269b4a ]
xfs_trans_dqresv is the function that we use to make reservations against resource quotas. Each resource contains two counters: the q_core counter, which tracks resources allocated on disk; and the dquot reservation counter, which tracks how much of that resource has either been allocated or reserved by threads that are working on metadata updates.
For disk blocks, we compare the proposed reservation counter against the hard and soft limits to decide if we're going to fail the operation. However, for inodes we inexplicably compare against the q_core counter, not the incore reservation count.
Since the q_core counter is always lower than the reservation count and we unlock the dquot between reservation and transaction commit, this means that multiple threads can reserve the last inode count before we hit the hard limit, and when they commit, we'll be well over the hard limit.
Fix this by checking against the incore inode reservation counter, since we would appear to maintain that correctly (and that's what we report in GETQUOTA).
Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Reviewed-by: Allison Collins allison.henderson@oracle.com Reviewed-by: Chandan Babu R chandanrlinux@gmail.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_trans_dquot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c index ce78534a047ee..bb8de2dddabe2 100644 --- a/fs/xfs/xfs_trans_dquot.c +++ b/fs/xfs/xfs_trans_dquot.c @@ -662,7 +662,7 @@ xfs_trans_dqresv( } } if (ninos > 0) { - total_count = be64_to_cpu(dqp->q_core.d_icount) + ninos; + total_count = dqp->q_res_icount + ninos; timer = be32_to_cpu(dqp->q_core.d_itimer); warns = be16_to_cpu(dqp->q_core.d_iwarns); warnlimit = dqp->q_mount->m_quotainfo->qi_iwarnlimit;
From: Zhe Li lizhe67@huawei.com
[ Upstream commit 798b7347e4f29553db4b996393caf12f5b233daf ]
The log of the UAF problem is listed below.

BUG: KASAN: use-after-free in jffs2_rmdir+0xa4/0x1cc [jffs2] at addr c1f165fc
Read of size 4 by task rm/8283
=============================================================================
BUG kmalloc-32 (Tainted: P B O ): kasan: bad access detected
-----------------------------------------------------------------------------

INFO: Allocated in 0xbbbbbbbb age=3054364 cpu=0 pid=0
	0xb0bba6ef
	jffs2_write_dirent+0x11c/0x9c8 [jffs2]
	__slab_alloc.isra.21.constprop.25+0x2c/0x44
	__kmalloc+0x1dc/0x370
	jffs2_write_dirent+0x11c/0x9c8 [jffs2]
	jffs2_do_unlink+0x328/0x5fc [jffs2]
	jffs2_rmdir+0x110/0x1cc [jffs2]
	vfs_rmdir+0x180/0x268
	do_rmdir+0x2cc/0x300
	ret_from_syscall+0x0/0x3c
INFO: Freed in 0x205b age=3054364 cpu=0 pid=0
	0x2e9173
	jffs2_add_fd_to_list+0x138/0x1dc [jffs2]
	jffs2_add_fd_to_list+0x138/0x1dc [jffs2]
	jffs2_garbage_collect_dirent.isra.3+0x21c/0x288 [jffs2]
	jffs2_garbage_collect_live+0x16bc/0x1800 [jffs2]
	jffs2_garbage_collect_pass+0x678/0x11d4 [jffs2]
	jffs2_garbage_collect_thread+0x1e8/0x3b0 [jffs2]
	kthread+0x1a8/0x1b0
	ret_from_kernel_thread+0x5c/0x64
Call Trace:
[c17ddd20] [c02452d4] kasan_report.part.0+0x298/0x72c (unreliable)
[c17ddda0] [d2509680] jffs2_rmdir+0xa4/0x1cc [jffs2]
[c17dddd0] [c026da04] vfs_rmdir+0x180/0x268
[c17dde00] [c026f4e4] do_rmdir+0x2cc/0x300
[c17ddf40] [c001a658] ret_from_syscall+0x0/0x3c
The root cause is that we don't take "jffs2_inode_info.sem" before we scan the list "jffs2_inode_info.dents" in jffs2_rmdir(). This patch adds code to take "jffs2_inode_info.sem" before we scan "jffs2_inode_info.dents", to solve the UAF problem.
Signed-off-by: Zhe Li lizhe67@huawei.com Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jffs2/dir.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c index e273171696972..7a3368929245d 100644 --- a/fs/jffs2/dir.c +++ b/fs/jffs2/dir.c @@ -588,10 +588,14 @@ static int jffs2_rmdir (struct inode *dir_i, struct dentry *dentry) int ret; uint32_t now = get_seconds();
+ mutex_lock(&f->sem); for (fd = f->dents ; fd; fd = fd->next) { - if (fd->ino) + if (fd->ino) { + mutex_unlock(&f->sem); return -ENOTEMPTY; + } } + mutex_unlock(&f->sem);
ret = jffs2_do_unlink(c, dir_f, dentry->d_name.name, dentry->d_name.len, f, now);
From: Javed Hasan jhasan@marvell.com
[ Upstream commit ec007ef40abb6a164d148b0dc19789a7a2de2cc8 ]
In fc_disc_gpn_id_resp(), skb is supposed to get freed in all cases except for PTR_ERR. However, in some cases it didn't.
This fix is to call fc_frame_free(fp) before the function returns.
Link: https://lore.kernel.org/r/20200729081824.30996-2-jhasan@marvell.com Reviewed-by: Girish Basrur gbasrur@marvell.com Reviewed-by: Santosh Vernekar svernekar@marvell.com Reviewed-by: Saurav Kashyap skashyap@marvell.com Reviewed-by: Shyam Sundar ssundar@marvell.com Signed-off-by: Javed Hasan jhasan@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/libfc/fc_disc.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c index 880a9068ca126..ef06af4e3611d 100644 --- a/drivers/scsi/libfc/fc_disc.c +++ b/drivers/scsi/libfc/fc_disc.c @@ -595,8 +595,12 @@ static void fc_disc_gpn_id_resp(struct fc_seq *sp, struct fc_frame *fp, mutex_lock(&disc->disc_mutex); if (PTR_ERR(fp) == -FC_EX_CLOSED) goto out; - if (IS_ERR(fp)) - goto redisc; + if (IS_ERR(fp)) { + mutex_lock(&disc->disc_mutex); + fc_disc_restart(disc); + mutex_unlock(&disc->disc_mutex); + goto out; + }
cp = fc_frame_payload_get(fp, sizeof(*cp)); if (!cp) @@ -621,7 +625,7 @@ static void fc_disc_gpn_id_resp(struct fc_seq *sp, struct fc_frame *fp, new_rdata->disc_id = disc->disc_id; lport->tt.rport_login(new_rdata); } - goto out; + goto free_fp; } rdata->disc_id = disc->disc_id; lport->tt.rport_login(rdata); @@ -635,6 +639,8 @@ static void fc_disc_gpn_id_resp(struct fc_seq *sp, struct fc_frame *fp, redisc: fc_disc_restart(disc); } +free_fp: + fc_frame_free(fp); out: mutex_unlock(&disc->disc_mutex); kref_put(&rdata->kref, lport->tt.rport_destroy);
From: Mao Wenan wenan.mao@linux.alibaba.com
[ Upstream commit 481a0d7422db26fb63e2d64f0652667a5c6d0f3e ]
The loop may exist if vq->broken is true: virtqueue_get_buf_ctx_packed or virtqueue_get_buf_ctx_split will return NULL, so virtnet_poll will reschedule napi to receive packets, which drives cpu usage (si) to 100%.
call trace as below:

virtnet_poll
	virtnet_receive
		virtqueue_get_buf_ctx
			virtqueue_get_buf_ctx_packed
			virtqueue_get_buf_ctx_split
	virtqueue_napi_complete
		virtqueue_poll              //return true
			virtqueue_napi_schedule //it will reschedule napi
To fix this, return false if vq is broken in virtqueue_poll().
Signed-off-by: Mao Wenan wenan.mao@linux.alibaba.com Acked-by: Michael S. Tsirkin mst@redhat.com Link: https://lore.kernel.org/r/1596354249-96204-1-git-send-email-wenan.mao@linux.... Signed-off-by: Michael S. Tsirkin mst@redhat.com Acked-by: Jason Wang jasowang@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/virtio/virtio_ring.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index a01a41a412693..6b3565feddb21 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -603,6 +603,9 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx) { struct vring_virtqueue *vq = to_vvq(_vq);
+ if (unlikely(vq->broken)) + return false; + virtio_mb(vq->weak_barriers); return (u16)last_used_idx != virtio16_to_cpu(_vq->vdev, vq->vring.used->idx); }
From: Eiichi Tsukata devel@etsukata.com
[ Upstream commit 96cf2a2c75567ff56195fe3126d497a2e7e4379f ]
If xfs_sysfs_init is called with parent_kobj == NULL, UBSAN shows the following warning:
UBSAN: null-ptr-deref in ./fs/xfs/xfs_sysfs.h:37:23
member access within null pointer of type 'struct xfs_kobj'
Call Trace:
 dump_stack+0x10e/0x195
 ubsan_type_mismatch_common+0x241/0x280
 __ubsan_handle_type_mismatch_v1+0x32/0x40
 init_xfs_fs+0x12b/0x28f
 do_one_initcall+0xdd/0x1d0
 do_initcall_level+0x151/0x1b6
 do_initcalls+0x50/0x8f
 do_basic_setup+0x29/0x2b
 kernel_init_freeable+0x19f/0x20b
 kernel_init+0x11/0x1e0
 ret_from_fork+0x22/0x30
Fix it by checking parent_kobj before the code accesses its member.
Signed-off-by: Eiichi Tsukata devel@etsukata.com Reviewed-by: Darrick J. Wong darrick.wong@oracle.com [darrick: minor whitespace edits] Signed-off-by: Darrick J. Wong darrick.wong@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_sysfs.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_sysfs.h b/fs/xfs/xfs_sysfs.h
index be692e59938db..c457b010c623d 100644
--- a/fs/xfs/xfs_sysfs.h
+++ b/fs/xfs/xfs_sysfs.h
@@ -44,9 +44,11 @@ xfs_sysfs_init(
 	struct xfs_kobj		*parent_kobj,
 	const char		*name)
 {
+	struct kobject		*parent;
+
+	parent = parent_kobj ? &parent_kobj->kobject : NULL;
 	init_completion(&kobj->complete);
-	return kobject_init_and_add(&kobj->kobject, ktype,
-			&parent_kobj->kobject, "%s", name);
+	return kobject_init_and_add(&kobj->kobject, ktype, parent, "%s", name);
 }
 
 static inline void
From: Luc Van Oostenryck luc.vanoostenryck@gmail.com
[ Upstream commit bd72866b8da499e60633ff28f8a4f6e09ca78efe ]
These accessors must be used to read/write a big-endian bus. The value returned or written is native-endian.
However, these accessors are defined using be{16,32}_to_cpu() or cpu_to_be{16,32}() to make the endian conversion but these expect a __be{16,32} when none is present. Keeping them would need a force cast that would solve nothing at all.
So, do the conversion using swab{16,32}, like done in asm-generic for similar situations.
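As a side note, a minimal illustration (not part of the patch; the helper names below are made up) of why the swab form is equivalent on little-endian Alpha while keeping sparse happy:

/*
 * Illustrative only: the value read from or written to the bus is
 * byte-swapped, which is exactly the big-endian conversion on a
 * little-endian machine, but no bogus __be16/__be32 cast is needed.
 */
static inline u16 example_ioread16be(void __iomem *p)
{
	return swab16(ioread16(p));
}

static inline void example_iowrite16be(u16 val, void __iomem *p)
{
	iowrite16(swab16(val), p);
}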
Reported-by: kernel test robot lkp@intel.com
Signed-off-by: Luc Van Oostenryck luc.vanoostenryck@gmail.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Cc: Richard Henderson rth@twiddle.net
Cc: Ivan Kokshaysky ink@jurassic.park.msu.ru
Cc: Matt Turner mattst88@gmail.com
Cc: Stephen Boyd sboyd@kernel.org
Cc: Arnd Bergmann arnd@arndb.de
Link: http://lkml.kernel.org/r/20200622114232.80039-1-luc.vanoostenryck@gmail.com
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 arch/alpha/include/asm/io.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h
index ff4049155c840..355aec0867f4d 100644
--- a/arch/alpha/include/asm/io.h
+++ b/arch/alpha/include/asm/io.h
@@ -491,10 +491,10 @@ extern inline void writeq(u64 b, volatile void __iomem *addr)
 }
 #endif
 
-#define ioread16be(p) be16_to_cpu(ioread16(p))
-#define ioread32be(p) be32_to_cpu(ioread32(p))
-#define iowrite16be(v,p) iowrite16(cpu_to_be16(v), (p))
-#define iowrite32be(v,p) iowrite32(cpu_to_be32(v), (p))
+#define ioread16be(p) swab16(ioread16(p))
+#define ioread32be(p) swab32(ioread32(p))
+#define iowrite16be(v,p) iowrite16(swab16(v), (p))
+#define iowrite32be(v,p) iowrite32(swab32(v), (p))
 
 #define inb_p		inb
 #define inw_p		inw
From: Eric Sandeen sandeen@redhat.com
[ Upstream commit 5872331b3d91820e14716632ebb56b1399b34fe1 ]
If for any reason a directory passed to do_split() does not have enough active entries to exceed half the size of the block, we can end up iterating over all "count" entries without finding a split point.
In this case, count == move, and split will be zero, and we will attempt a negative index into map[].
Guard against this by detecting this case, and falling back to split-to-half-of-count instead; in this case we will still have plenty of space (> half blocksize) in each split block.
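For illustration, a standalone sketch of the old split-point selection (simplified types, names and loop condition; not the actual ext4 code) that shows the corner case:

/*
 * If no suffix of entries ever exceeds half the block, the loop runs i
 * down to -1, move ends up equal to count, and count - move is 0; the
 * caller's map[split - 1] access then reads map[-1].
 */
struct sketch_map_entry {
	unsigned int hash;
	unsigned int size;
};

static int old_split_point(const struct sketch_map_entry *map,
			   int count, unsigned int blocksize)
{
	unsigned int size = 0;
	int move = 0, i;

	for (i = count - 1; i >= 0; i--) {
		if (size + map[i].size > blocksize / 2)
			break;
		size += map[i].size;
		move++;
	}
	return count - move;	/* 0 when the loop never broke out */
}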
Fixes: ef2b02d3e617 ("ext34: ensure do_split leaves enough free space in both blocks")
Signed-off-by: Eric Sandeen sandeen@redhat.com
Reviewed-by: Andreas Dilger adilger@dilger.ca
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/f53e246b-647c-64bb-16ec-135383c70ad7@redhat.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/ext4/namei.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index faf142a6fa8bb..061b026e464c5 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1730,7 +1730,7 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 			     blocksize, hinfo, map);
 	map -= count;
 	dx_sort_map(map, count);
-	/* Split the existing block in the middle, size-wise */
+	/* Ensure that neither split block is over half full */
 	size = 0;
 	move = 0;
 	for (i = count-1; i >= 0; i--) {
@@ -1740,8 +1740,18 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 		size += map[i].size;
 		move++;
 	}
-	/* map index at which we will split */
-	split = count - move;
+	/*
+	 * map index at which we will split
+	 *
+	 * If the sum of active entries didn't exceed half the block size, just
+	 * split it in half by count; each resulting block will have at least
+	 * half the space free.
+	 */
+	if (i > 0)
+		split = count - move;
+	else
+		split = count/2;
+
 	hash2 = map[split].hash;
 	continued = hash2 == map[split - 1].hash;
 	dxtrace(printk(KERN_INFO "Split block %lu at %x, %i/%i\n",
From: Dinghao Liu dinghao.liu@zju.edu.cn
[ Upstream commit 062fa09f44f4fb3776a23184d5d296b0c8872eb9 ]
When power_up_sst() fails, stream needs to be freed just as when try_module_get() fails. However, the current code returns directly and ends up leaking memory.
Fixes: 0121327c1a68b ("ASoC: Intel: mfld-pcm: add control for powering up/down dsp")
Signed-off-by: Dinghao Liu dinghao.liu@zju.edu.cn
Acked-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
Link: https://lore.kernel.org/r/20200813084112.26205-1-dinghao.liu@zju.edu.cn
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 sound/soc/intel/atom/sst-mfld-platform-pcm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/soc/intel/atom/sst-mfld-platform-pcm.c b/sound/soc/intel/atom/sst-mfld-platform-pcm.c
index 1d9dfb92b3b48..edb244331e6e9 100644
--- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c
+++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c
@@ -338,7 +338,7 @@ static int sst_media_open(struct snd_pcm_substream *substream,
 
 	ret_val = power_up_sst(stream);
 	if (ret_val < 0)
-		return ret_val;
+		goto out_power_up;
 
 	/* Make sure, that the period size is always even */
 	snd_pcm_hw_constraint_step(substream->runtime, 0,
@@ -347,8 +347,9 @@ static int sst_media_open(struct snd_pcm_substream *substream,
 	return snd_pcm_hw_constraint_integer(runtime,
 			 SNDRV_PCM_HW_PARAM_PERIODS);
 out_ops:
-	kfree(stream);
 	mutex_unlock(&sst_lock);
+out_power_up:
+	kfree(stream);
 	return ret_val;
 }
 
From: Michael Ellerman mpe@ellerman.id.au
commit 63dee5df43a31f3844efabc58972f0a206ca4534 upstream.
We have powerpc specific logic in our page fault handling to decide if an access to an unmapped address below the stack pointer should expand the stack VMA.
The code was originally added in 2004 "ported from 2.4". The rough logic is that the stack is allowed to grow to 1MB with no extra checking. Over 1MB the access must be within 2048 bytes of the stack pointer, or be from a user instruction that updates the stack pointer.
The 2048 byte allowance below the stack pointer is there to cover the 288 byte "red zone" as well as the "about 1.5kB" needed by the signal delivery code.
Unfortunately since then the signal frame has expanded, and is now 4224 bytes on 64-bit kernels with transactional memory enabled. This means if a process has consumed more than 1MB of stack, and its stack pointer lies less than 4224 bytes from the next page boundary, signal delivery will fault when trying to expand the stack and the process will see a SEGV.
The total size of the signal frame is the size of struct rt_sigframe (which includes the red zone) plus __SIGNAL_FRAMESIZE (128 bytes on 64-bit).
The 2048 byte allowance was correct until 2008 as the signal frame was:
  struct rt_sigframe {
          struct ucontext            uc;                   /*     0  1440 */
          /* --- cacheline 11 boundary (1408 bytes) was 32 bytes ago --- */
          long unsigned int          _unused[2];           /*  1440    16 */
          unsigned int               tramp[6];             /*  1456    24 */
          struct siginfo *           pinfo;                /*  1480     8 */
          void *                     puc;                  /*  1488     8 */
          struct siginfo             info;                 /*  1496   128 */
          /* --- cacheline 12 boundary (1536 bytes) was 88 bytes ago --- */
          char                       abigap[288];          /*  1624   288 */

          /* size: 1920, cachelines: 15, members: 7 */
          /* padding: 8 */
  };
1920 + 128 = 2048
Then in commit ce48b2100785 ("powerpc: Add VSX context save/restore, ptrace and signal support") (Jul 2008) the signal frame expanded to 2304 bytes:
  struct rt_sigframe {
          struct ucontext            uc;                   /*     0  1696 */	<--
          /* --- cacheline 13 boundary (1664 bytes) was 32 bytes ago --- */
          long unsigned int          _unused[2];           /*  1696    16 */
          unsigned int               tramp[6];             /*  1712    24 */
          struct siginfo *           pinfo;                /*  1736     8 */
          void *                     puc;                  /*  1744     8 */
          struct siginfo             info;                 /*  1752   128 */
          /* --- cacheline 14 boundary (1792 bytes) was 88 bytes ago --- */
          char                       abigap[288];          /*  1880   288 */

          /* size: 2176, cachelines: 17, members: 7 */
          /* padding: 8 */
  };
2176 + 128 = 2304
At this point we should have been exposed to the bug, though as far as I know it was never reported. I no longer have a system old enough to easily test on.
Then in 2010 commit 320b2b8de126 ("mm: keep a guard page below a grow-down stack segment") caused our stack expansion code to never trigger, as there was always a VMA found for a write up to PAGE_SIZE below r1.
That meant the bug was hidden as we continued to expand the signal frame in commit 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") (Feb 2013):
  struct rt_sigframe {
          struct ucontext            uc;                   /*     0  1696 */
          /* --- cacheline 13 boundary (1664 bytes) was 32 bytes ago --- */
          struct ucontext            uc_transact;          /*  1696  1696 */	<--
          /* --- cacheline 26 boundary (3328 bytes) was 64 bytes ago --- */
          long unsigned int          _unused[2];           /*  3392    16 */
          unsigned int               tramp[6];             /*  3408    24 */
          struct siginfo *           pinfo;                /*  3432     8 */
          void *                     puc;                  /*  3440     8 */
          struct siginfo             info;                 /*  3448   128 */
          /* --- cacheline 27 boundary (3456 bytes) was 120 bytes ago --- */
          char                       abigap[288];          /*  3576   288 */

          /* size: 3872, cachelines: 31, members: 8 */
          /* padding: 8 */
          /* last cacheline: 32 bytes */
  };
3872 + 128 = 4000
And commit 573ebfa6601f ("powerpc: Increase stack redzone for 64-bit userspace to 512 bytes") (Feb 2014):
  struct rt_sigframe {
          struct ucontext            uc;                   /*     0  1696 */
          /* --- cacheline 13 boundary (1664 bytes) was 32 bytes ago --- */
          struct ucontext            uc_transact;          /*  1696  1696 */
          /* --- cacheline 26 boundary (3328 bytes) was 64 bytes ago --- */
          long unsigned int          _unused[2];           /*  3392    16 */
          unsigned int               tramp[6];             /*  3408    24 */
          struct siginfo *           pinfo;                /*  3432     8 */
          void *                     puc;                  /*  3440     8 */
          struct siginfo             info;                 /*  3448   128 */
          /* --- cacheline 27 boundary (3456 bytes) was 120 bytes ago --- */
          char                       abigap[512];          /*  3576   512 */	<--

          /* size: 4096, cachelines: 32, members: 8 */
          /* padding: 8 */
  };
4096 + 128 = 4224
Then finally in 2017, commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") exposed us to the existing bug, because it changed the stack VMA to be the correct/real size, meaning our stack expansion code is now triggered.
Fix it by increasing the allowance to 4224 bytes.
Hard-coding 4224 is obviously unsafe against future expansions of the signal frame in the same way as the existing code. We can't easily use sizeof() because the signal frame structure is not in a header. We will either fix that, or rip out all the custom stack expansion checking logic entirely.
Fixes: ce48b2100785 ("powerpc: Add VSX context save/restore, ptrace and signal support")
Cc: stable@vger.kernel.org # v2.6.27+
Reported-by: Tom Lane tgl@sss.pgh.pa.us
Tested-by: Daniel Axtens dja@axtens.net
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Link: https://lore.kernel.org/r/20200724092528.1578671-2-mpe@ellerman.id.au
Signed-off-by: Daniel Axtens dja@axtens.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/powerpc/mm/fault.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -192,6 +192,9 @@ static int mm_fault_error(struct pt_regs
 	return MM_FAULT_CONTINUE;
 }
 
+// This comes from 64-bit struct rt_sigframe + __SIGNAL_FRAMESIZE
+#define SIGFRAME_MAX_SIZE	(4096 + 128)
+
 /*
  * For 600- and 800-family processors, the error_code parameter is DSISR
  * for a data fault, SRR1 for an instruction fault. For 400-family processors
@@ -341,7 +344,7 @@ retry:
 		/*
 		 * N.B. The POWER/Open ABI allows programs to access up to
 		 * 288 bytes below the stack pointer.
-		 * The kernel signal delivery code writes up to about 1.5kB
+		 * The kernel signal delivery code writes up to about 4kB
 		 * below the stack pointer (r1) before decrementing it.
 		 * The exec code can write slightly over 640kB to the stack
 		 * before setting the user r1. Thus we allow the stack to
@@ -365,7 +368,7 @@ retry:
 		 * between the last mapped region and the stack will
 		 * expand the stack rather than segfaulting.
		 */
-		if (address + 2048 < uregs->gpr[1] && !store_update_sp)
+		if (address + SIGFRAME_MAX_SIZE < uregs->gpr[1] && !store_update_sp)
 			goto bad_area;
 	}
 	if (expand_stack(vma, address))
From: Marc Zyngier maz@kernel.org
commit a9ed4a6560b8562b7e2e2bed9527e88001f7b682 upstream.
When adding a new fd to an epoll, and this new fd is itself an epoll fd, we recursively scan the fds attached to it to detect cycles, and add non-epoll files to a "check list" that gets subsequently parsed.

However, this check list isn't completely safe when deletions can happen concurrently. To sidestep the issue, make sure that a struct file placed on the check list sees its f_count increased, ensuring that a concurrent deletion won't result in the file disappearing from under our feet.
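For context, a sketch of the rule the patch establishes (simplified; the helper names are made up, the real code lives in fs/eventpoll.c): a file may only sit on tfile_check_list while we hold a reference to it.

static void sketch_check_list_add(struct file *file)
{
	if (list_empty(&file->f_tfile_llink)) {
		get_file(file);			/* pin the file */
		list_add(&file->f_tfile_llink, &tfile_check_list);
	}
}

static void sketch_check_list_clear(void)
{
	while (!list_empty(&tfile_check_list)) {
		struct file *file = list_first_entry(&tfile_check_list,
						     struct file, f_tfile_llink);

		list_del_init(&file->f_tfile_llink);
		fput(file);			/* drop the pin */
	}
}

With that pairing, a concurrent close() can drop its own reference, but the file cannot be freed while it is still reachable from the check list.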
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier maz@kernel.org
Signed-off-by: Al Viro viro@zeniv.linux.org.uk
Signed-off-by: Marc Zyngier maz@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/eventpoll.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1719,9 +1719,11 @@ static int ep_loop_check_proc(void *priv
 			 * not already there, and calling reverse_path_check()
 			 * during ep_insert().
 			 */
-			if (list_empty(&epi->ffd.file->f_tfile_llink))
+			if (list_empty(&epi->ffd.file->f_tfile_llink)) {
+				get_file(epi->ffd.file);
 				list_add(&epi->ffd.file->f_tfile_llink,
 					 &tfile_check_list);
+			}
 		}
 	}
 	mutex_unlock(&ep->mtx);
@@ -1765,6 +1767,7 @@ static void clear_tfile_check_list(void)
 		file = list_first_entry(&tfile_check_list, struct file,
 					f_tfile_llink);
 		list_del_init(&file->f_tfile_llink);
+		fput(file);
 	}
 	INIT_LIST_HEAD(&tfile_check_list);
 }
@@ -1906,9 +1909,11 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, in
 				clear_tfile_check_list();
 				goto error_tgt_fput;
 			}
-		} else
+		} else {
+			get_file(tf.file);
 			list_add(&tf.file->f_tfile_llink,
 				 &tfile_check_list);
+		}
 		mutex_lock_nested(&ep->mtx, 0);
 		if (is_file_epoll(tf.file)) {
 			tep = tf.file->private_data;
From: Al Viro viro@zeniv.linux.org.uk
commit 52c479697c9b73f628140dcdfcd39ea302d05482 upstream.
Signed-off-by: Al Viro viro@zeniv.linux.org.uk
Signed-off-by: Marc Zyngier maz@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/eventpoll.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1905,10 +1905,8 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, in
 			mutex_lock(&epmutex);
 			if (is_file_epoll(tf.file)) {
 				error = -ELOOP;
-				if (ep_loop_check(ep, tf.file) != 0) {
-					clear_tfile_check_list();
+				if (ep_loop_check(ep, tf.file) != 0)
 					goto error_tgt_fput;
-				}
 			} else {
 				get_file(tf.file);
 				list_add(&tf.file->f_tfile_llink,
@@ -1937,8 +1935,6 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, in
 			error = ep_insert(ep, &epds, tf.file, fd, full_check);
 		} else
 			error = -EEXIST;
-		if (full_check)
-			clear_tfile_check_list();
 		break;
 	case EPOLL_CTL_DEL:
 		if (epi)
@@ -1959,8 +1955,10 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, in
 	mutex_unlock(&ep->mtx);
 
 error_tgt_fput:
-	if (full_check)
+	if (full_check) {
+		clear_tfile_check_list();
 		mutex_unlock(&epmutex);
+	}
 
 	fdput(tf);
 error_fput:
From: Peter Xu peterx@redhat.com
commit 75802ca66354a39ab8e35822747cd08b3384a99a upstream.
This is found by code observation only.
Firstly, the worst case scenario should assume the whole range was covered by pmd sharing. The old algorithm might not work as expected for ranges like (1g-2m, 1g+2m), where the adjusted range should be (0, 1g+2m) but the expected range should be (0, 2g).
While at it, remove the loop since it should not be required. With that, the new code should also be faster when the invalidating range is huge.
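As a worked example (taking PUD_SIZE = 1g and a vma covering (0, 2g), as in the range above):

  a_start = ALIGN_DOWN(1g - 2m, 1g) = 0
  a_end   = ALIGN(1g + 2m, 1g)      = 2g

Intersecting (a_start, a_end) with the vma then yields (0, 2g), the full worst case, where the old loop stopped at (0, 1g + 2m).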
Mike said:
: With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
: adjust to (0, 1g+2m) which is incorrect.
:
: We should cc stable.  The original reason for adjusting the range was to
: prevent data corruption (getting wrong page).  Since the range is not
: always adjusted correctly, the potential for corruption still exists.
:
: However, I am fairly confident that adjust_range_if_pmd_sharing_possible
: is only going to be called in two cases:
:
: 1) for a single page
: 2) for range == entire vma
:
: In those cases, the current code should produce the correct results.
:
: To be safe, let's just cc stable.
Fixes: 017b1660df89 ("mm: migration: fix migration of huge PMD shared pages")
Signed-off-by: Peter Xu peterx@redhat.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Reviewed-by: Mike Kravetz mike.kravetz@oracle.com
Cc: Andrea Arcangeli aarcange@redhat.com
Cc: Matthew Wilcox willy@infradead.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20200730201636.74778-1-peterx@redhat.com
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Mike Kravetz mike.kravetz@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/hugetlb.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4257,6 +4257,7 @@ static bool vma_shareable(struct vm_area
 	return false;
 }
 
+#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a))
 /*
  * Determine if start,end range within vma could be mapped by shared pmd.
  * If yes, adjust start and end to cover range associated with possible
@@ -4265,25 +4266,21 @@
 void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 				unsigned long *start, unsigned long *end)
 {
-	unsigned long check_addr = *start;
+	unsigned long a_start, a_end;
 
 	if (!(vma->vm_flags & VM_MAYSHARE))
 		return;
 
-	for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
-		unsigned long a_start = check_addr & PUD_MASK;
-		unsigned long a_end = a_start + PUD_SIZE;
+	/* Extend the range to be PUD aligned for a worst case scenario */
+	a_start = ALIGN_DOWN(*start, PUD_SIZE);
+	a_end = ALIGN(*end, PUD_SIZE);
 
-		/*
-		 * If sharing is possible, adjust start/end if necessary.
-		 */
-		if (range_in_vma(vma, a_start, a_end)) {
-			if (a_start < *start)
-				*start = a_start;
-			if (a_end > *end)
-				*end = a_end;
-		}
-	}
+	/*
+	 * Intersect the range with the vma range, since pmd sharing won't be
+	 * across vma after all
+	 */
+	*start = max(vma->vm_start, a_start);
+	*end = min(vma->vm_end, a_end);
 }
 
 /*
From: Juergen Gross jgross@suse.com
For support of long running hypercalls xen_maybe_preempt_hcall() is calling cond_resched() in case a hypercall marked as preemptible has been interrupted.
Normally this is no problem, as only hypercalls done via some ioctl()s are marked to be preemptible. In rare cases, when during such a preemptible hypercall an interrupt occurs and any softirq action is started from irq_exit(), a further hypercall issued by the softirq handler will be regarded as preemptible, too. This might lead to rescheduling in spite of the softirq handler potentially having called preempt_disable(), leading to splats like:
BUG: sleeping function called from invalid context at drivers/xen/preempt.c:37
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 20775, name: xl
INFO: lockdep is turned off.
CPU: 1 PID: 20775 Comm: xl Tainted: G D W 5.4.46-1_prgmr_debug.el7.x86_64 #1
Call Trace:
 <IRQ>
 dump_stack+0x8f/0xd0
 ___might_sleep.cold.76+0xb2/0x103
 xen_maybe_preempt_hcall+0x48/0x70
 xen_do_hypervisor_callback+0x37/0x40
RIP: e030:xen_hypercall_xen_version+0xa/0x20
Code: ...
RSP: e02b:ffffc900400dcc30 EFLAGS: 00000246
RAX: 000000000004000d RBX: 0000000000000200 RCX: ffffffff8100122a
RDX: ffff88812e788000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffff83ee3ad0 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: ffff8881824aa0b0
R13: 0000000865496000 R14: 0000000865496000 R15: ffff88815d040000
 ? xen_hypercall_xen_version+0xa/0x20
 ? xen_force_evtchn_callback+0x9/0x10
 ? check_events+0x12/0x20
 ? xen_restore_fl_direct+0x1f/0x20
 ? _raw_spin_unlock_irqrestore+0x53/0x60
 ? debug_dma_sync_single_for_cpu+0x91/0xc0
 ? _raw_spin_unlock_irqrestore+0x53/0x60
 ? xen_swiotlb_sync_single_for_cpu+0x3d/0x140
 ? mlx4_en_process_rx_cq+0x6b6/0x1110 [mlx4_en]
 ? mlx4_en_poll_rx_cq+0x64/0x100 [mlx4_en]
 ? net_rx_action+0x151/0x4a0
 ? __do_softirq+0xed/0x55b
 ? irq_exit+0xea/0x100
 ? xen_evtchn_do_upcall+0x2c/0x40
 ? xen_do_hypervisor_callback+0x29/0x40
 </IRQ>
 ? xen_hypercall_domctl+0xa/0x20
 ? xen_hypercall_domctl+0x8/0x20
 ? privcmd_ioctl+0x221/0x990 [xen_privcmd]
 ? do_vfs_ioctl+0xa5/0x6f0
 ? ksys_ioctl+0x60/0x90
 ? trace_hardirqs_off_thunk+0x1a/0x20
 ? __x64_sys_ioctl+0x16/0x20
 ? do_syscall_64+0x62/0x250
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fix that by testing preempt_count() before calling cond_resched().
In kernel 5.8 this can't happen any more due to the entry code rework (more than 100 patches, so not a candidate for backporting).
The issue was introduced in kernel 4.3, so this patch should go into all stable kernels in [4.3 ... 5.7].
Reported-by: Sarah Newman srn@prgmr.com
Fixes: 0fa2f5cb2b0ecd8 ("sched/preempt, xen: Use need_resched() instead of should_resched()")
Cc: Sarah Newman srn@prgmr.com
Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross jgross@suse.com
Tested-by: Chris Brannon cmb@prgmr.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/xen/preempt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/xen/preempt.c
+++ b/drivers/xen/preempt.c
@@ -31,7 +31,7 @@ EXPORT_SYMBOL_GPL(xen_in_preemptible_hca
 asmlinkage __visible void xen_maybe_preempt_hcall(void)
 {
 	if (unlikely(__this_cpu_read(xen_in_preemptible_hcall)
-		     && need_resched())) {
+		     && need_resched() && !preempt_count())) {
 		/*
 		 * Clear flag as we may be rescheduled on a different
 		 * cpu.
From: Adam Ford aford173@gmail.com
There appears to be a timing issue where using a divider of 32 breaks the DSS for OMAP36xx despite the TRM stating 32 is a valid number. Through experimentation, it appears that 31 works.
The same fix was issued for kernels 4.5+. However, between kernels 4.4 and 4.5 the directory structure changed when the dss directory was moved inside the omapfb directory, so that patch does not apply cleanly to 4.4 and older.
A similar patch was applied to the 3.16 kernel already, but not to 4.4. Commit 4b911101a5cd ("drm/omap: fix max fclk divider for omap36xx") is on the 3.16 stable branch with notes from Ben about the path change.
Since this was applied for 3.16 already, this patch is for kernels 3.17 through 4.4 only.
Fixes: f7018c213502 ("video: move fbdev to drivers/video/fbdev")
Cc: stable@vger.kernel.org #3.17 - 4.4
CC: tomi.valkeinen@ti.com
Signed-off-by: Adam Ford aford173@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/video/fbdev/omap2/dss/dss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/video/fbdev/omap2/dss/dss.c
+++ b/drivers/video/fbdev/omap2/dss/dss.c
@@ -843,7 +843,7 @@ static const struct dss_features omap34x
 };
 
 static const struct dss_features omap3630_dss_feats = {
-	.fck_div_max		=	32,
+	.fck_div_max		=	31,
 	.dss_fck_multiplier	=	1,
 	.parent_clk_name	=	"dpll4_ck",
 	.dpi_select_source	=	&dss_dpi_select_source_omap2_omap3,
On Mon, 24 Aug 2020 10:30:56 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.234 release. There are 33 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
> Responses should be made by Wed, 26 Aug 2020 08:23:34 +0000. Anything received after that time might be too late.
> The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.234-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.4:
    6 builds:	6 pass, 0 fail
    12 boots:	12 pass, 0 fail
    28 tests:	28 pass, 0 fail
Linux version:	4.4.234-rc1-g00117144aa83
Boards tested:	tegra124-jetson-tk1, tegra20-ventana, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
Hi!
> Responses should be made by Wed, 26 Aug 2020 08:23:34 +0000. Anything received after that time might be too late.
> The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.234-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below.
I believe this was tested successfully with CIP project:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/18...
Yes, there are failures, but they are DNS resolution problems during testing.
Best regards, Pavel