November 2021 - Linux-stable-mirror

stable-rc/queue/4.9 baseline: 138 runs, 1 regressions (v4.9.289-5-gd40c0701ae4c)

by kernelci.org bot

stable-rc/queue/4.9 baseline: 138 runs, 1 regressions (v4.9.289-5-gd40c0701ae4c) Regressions Summary ------------------- platform | arch | lab | compiler | defconfig | regressions ---------+------+---------------+----------+---------------------+------------ panda | arm | lab-collabora | gcc-10 | omap2plus_defconfig | 1 Details: https://kernelci.org/test/job/stable-rc/branch/queue%2F4.9/kernel/v4.9.289-… Test: baseline Tree: stable-rc Branch: queue/4.9 Describe: v4.9.289-5-gd40c0701ae4c URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git SHA: d40c0701ae4cb5ab84408e8938750ffd652a9821 Test Regressions ---------------- platform | arch | lab | compiler | defconfig | regressions ---------+------+---------------+----------+---------------------+------------ panda | arm | lab-collabora | gcc-10 | omap2plus_defconfig | 1 Details: https://kernelci.org/test/plan/id/61856cedb86e88935f3358dc Results: 4 PASS, 1 FAIL, 1 SKIP Full config: omap2plus_defconfig Compiler: gcc-10 (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110) Plain log: https://storage.kernelci.org//stable-rc/queue-4.9/v4.9.289-5-gd40c0701ae4c/… HTML log: https://storage.kernelci.org//stable-rc/queue-4.9/v4.9.289-5-gd40c0701ae4c/… Rootfs: http://storage.kernelci.org/images/rootfs/buildroot/kci-2020.05-6-g8983f3b7… * baseline.dmesg.emerg: https://kernelci.org/test/case/id/61856cedb86e88935f3358df failing since 4 days (last pass: v4.9.288-20-g1a0db32ed119, first fail: v4.9.288-20-g7720b834ad25) 2 lines 2021-11-05T17:41:52.536916 [ 20.591583] <LAVA_SIGNAL_TESTCASE TEST_CASE_ID=alert RESULT=pass UNITS=lines MEASUREMENT=0> 2021-11-05T17:41:52.587750 kern :emerg : BUG: spinlock bad magic on CPU#0, udevd/118 2021-11-05T17:41:52.597955 kern :emerg : lock: emif_lock+0x0/0xfffff230 [emif], .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 2021-11-05T17:41:52.617698 [ 20.675109] <LAVA_SIGNAL_TESTCASE TEST_CASE_ID=emerg RESULT=fail UNITS=lines MEASUREMENT=2>

3 years, 11 months

1
0
0 0

[PATCH v2] staging/fbtft: Fix backlight

by Noralf Trønnes

Commit b4a1ed0cd18b ("fbdev: make FB_BACKLIGHT a tristate") forgot to update fbtft breaking its backlight support when FB_BACKLIGHT is a module. Since FB_TFT selects FB_BACKLIGHT there's no need for this conditional so just remove it and we're good. Fixes: b4a1ed0cd18b ("fbdev: make FB_BACKLIGHT a tristate") Cc: <stable(a)vger.kernel.org> Signed-off-by: Noralf Trønnes <noralf(a)tronnes.org> --- Changes since v1: - No need for the #ifdef at all since FB_BACKLIGHT is always selected - Fix fb_ssd1351 as well fb_watterott has the same problem, but I've sent a removal patch for that one. drivers/staging/fbtft/fb_ssd1351.c | 4 ---- drivers/staging/fbtft/fbtft-core.c | 9 +-------- 2 files changed, 1 insertion(+), 12 deletions(-) diff --git a/drivers/staging/fbtft/fb_ssd1351.c b/drivers/staging/fbtft/fb_ssd1351.c index cf263a58a148..6fd549a424d5 100644 --- a/drivers/staging/fbtft/fb_ssd1351.c +++ b/drivers/staging/fbtft/fb_ssd1351.c @@ -187,7 +187,6 @@ static struct fbtft_display display = { }, }; -#ifdef CONFIG_FB_BACKLIGHT static int update_onboard_backlight(struct backlight_device *bd) { struct fbtft_par *par = bl_get_data(bd); @@ -231,9 +230,6 @@ static void register_onboard_backlight(struct fbtft_par *par) if (!par->fbtftops.unregister_backlight) par->fbtftops.unregister_backlight = fbtft_unregister_backlight; } -#else -static void register_onboard_backlight(struct fbtft_par *par) { }; -#endif FBTFT_REGISTER_DRIVER(DRVNAME, "solomon,ssd1351", &display); diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c index ed992ca605eb..1690358b8f01 100644 --- a/drivers/staging/fbtft/fbtft-core.c +++ b/drivers/staging/fbtft/fbtft-core.c @@ -128,7 +128,6 @@ static int fbtft_request_gpios(struct fbtft_par *par) return 0; } -#ifdef CONFIG_FB_BACKLIGHT static int fbtft_backlight_update_status(struct backlight_device *bd) { struct fbtft_par *par = bl_get_data(bd); @@ -161,6 +160,7 @@ void fbtft_unregister_backlight(struct fbtft_par *par) par->info->bl_dev = NULL; } } +EXPORT_SYMBOL(fbtft_unregister_backlight); static const struct backlight_ops fbtft_bl_ops = { .get_brightness = fbtft_backlight_get_brightness, @@ -198,12 +198,7 @@ void fbtft_register_backlight(struct fbtft_par *par) if (!par->fbtftops.unregister_backlight) par->fbtftops.unregister_backlight = fbtft_unregister_backlight; } -#else -void fbtft_register_backlight(struct fbtft_par *par) { }; -void fbtft_unregister_backlight(struct fbtft_par *par) { }; -#endif EXPORT_SYMBOL(fbtft_register_backlight); -EXPORT_SYMBOL(fbtft_unregister_backlight); static void fbtft_set_addr_win(struct fbtft_par *par, int xs, int ys, int xe, int ye) @@ -853,13 +848,11 @@ int fbtft_register_framebuffer(struct fb_info *fb_info) fb_info->fix.smem_len >> 10, text1, HZ / fb_info->fbdefio->delay, text2); -#ifdef CONFIG_FB_BACKLIGHT /* Turn on backlight if available */ if (fb_info->bl_dev) { fb_info->bl_dev->props.power = FB_BLANK_UNBLANK; fb_info->bl_dev->ops->update_status(fb_info->bl_dev); } -#endif return 0; -- 2.33.0

3 years, 11 months

1
0
0 0

[patch 174/262] mm, thp: fix incorrect unmap behavior for private pages

by Andrew Morton

From: Rongwei Wang <rongwei.wang(a)linux.alibaba.com> Subject: mm, thp: fix incorrect unmap behavior for private pages When truncating pagecache on file THP, the private pages of a process should not be unmapped mapping. This incorrect behavior on a dynamic shared libraries which will cause related processes to happen core dump. A simple test for a DSO (Prerequisite is the DSO mapped in file THP): int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_WRONLY); if (fd < 0) { perror("open"); } close(fd); return 0; } The test only to open a target DSO, and do nothing. But this operation will lead one or more process to happen core dump. This patch mainly to fix this bug. Link: https://lkml.kernel.org/r/20211025092134.18562-3-rongwei.wang@linux.alibaba… Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by: Rongwei Wang <rongwei.wang(a)linux.alibaba.com> Tested-by: Xu Yu <xuyu(a)linux.alibaba.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Song Liu <song(a)kernel.org> Cc: William Kucharski <william.kucharski(a)oracle.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Yang Shi <shy828301(a)gmail.com> Cc: Mike Kravetz <mike.kravetz(a)oracle.com> Cc: Collin Fijalkovich <cfijalkovich(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/open.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) --- a/fs/open.c~mm-thp-fix-incorrect-unmap-behavior-for-private-pages +++ a/fs/open.c @@ -857,8 +857,17 @@ static int do_dentry_open(struct file *f */ smp_mb(); if (filemap_nr_thps(inode->i_mapping)) { + struct address_space *mapping = inode->i_mapping; + filemap_invalidate_lock(inode->i_mapping); - truncate_pagecache(inode, 0); + /* + * unmap_mapping_range just need to be called once + * here, because the private pages is not need to be + * unmapped mapping (e.g. data segment of dynamic + * shared libraries here). + */ + unmap_mapping_range(mapping, 0, 0, 0); + truncate_inode_pages(mapping, 0); filemap_invalidate_unlock(inode->i_mapping); } } _

3 years, 11 months

1
0
0 0

[patch 173/262] mm, thp: lock filemap when truncating page cache

by Andrew Morton

From: Rongwei Wang <rongwei.wang(a)linux.alibaba.com> Subject: mm, thp: lock filemap when truncating page cache Patch series "fix two bugs for file THP". This patch (of 2): Transparent huge page has supported read-only non-shmem files. The file- backed THP is collapsed by khugepaged and truncated when written (for shared libraries). However, there is a race when multiple writers truncate the same page cache concurrently. In that case, subpage(s) of file THP can be revealed by find_get_entry in truncate_inode_pages_range, which will trigger PageTail BUG_ON in truncate_inode_page, as follows. page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0 flags: 0x37fffe0000010815(locked|uptodate|lru|arch_1|head) raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000 head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000 page dumped because: VM_BUG_ON_PAGE(PageTail(page)) ------------[ cut here ]------------ kernel BUG at mm/truncate.c:213! Internal error: Oops - BUG: 0 [#1] SMP Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ... CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ... Hardware name: ECS, BIOS 0.0.0 02/06/2015 pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) pc : truncate_inode_page+0x64/0x70 lr : truncate_inode_page+0x64/0x70 sp : ffff80001b60b900 x29: ffff80001b60b900 x28: 00000000000007ff x27: ffff80001b60b9a0 x26: 0000000000000000 x25: 000000000000000f x24: ffff80001b60b9a0 x23: ffff80001b60ba18 x22: ffff0001e0999ea8 x21: ffff0000c21db300 x20: ffffffffffffffff x19: fffffe001310ffc0 x18: 0000000000000020 x17: 0000000000000000 x16: 0000000000000000 x15: ffff0000c21db960 x14: 3030306666666620 x13: 6666666666666666 x12: 3130303030303030 x11: ffff8000117b69b8 x10: 00000000ffff8000 x9 : ffff80001012690c x8 : 0000000000000000 x7 : ffff8000114f69b8 x6 : 0000000000017ffd x5 : ffff0007fffbcbc8 x4 : ffff80001b60b5c0 x3 : 0000000000000001 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: truncate_inode_page+0x64/0x70 truncate_inode_pages_range+0x550/0x7e4 truncate_pagecache+0x58/0x80 do_dentry_open+0x1e4/0x3c0 vfs_open+0x38/0x44 do_open+0x1f0/0x310 path_openat+0x114/0x1dc do_filp_open+0x84/0x134 do_sys_openat2+0xbc/0x164 __arm64_sys_openat+0x74/0xc0 el0_svc_common.constprop.0+0x88/0x220 do_el0_svc+0x30/0xa0 el0_svc+0x20/0x30 el0_sync_handler+0x1a4/0x1b0 el0_sync+0x180/0x1c0 Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000) ---[ end trace f70cdb42cb7c2d42 ]--- Kernel panic - not syncing: Oops - BUG: Fatal exception This patch mainly to lock filemap when one enter truncate_pagecache(), avoiding truncating the same page cache concurrently. Link: https://lkml.kernel.org/r/20211025092134.18562-1-rongwei.wang@linux.alibaba… Link: https://lkml.kernel.org/r/20211025092134.18562-2-rongwei.wang@linux.alibaba… Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs") Signed-off-by: Xu Yu <xuyu(a)linux.alibaba.com> Signed-off-by: Rongwei Wang <rongwei.wang(a)linux.alibaba.com> Suggested-by: Matthew Wilcox (Oracle) <willy(a)infradead.org> Tested-by: Song Liu <song(a)kernel.org> Cc: Collin Fijalkovich <cfijalkovich(a)google.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Mike Kravetz <mike.kravetz(a)oracle.com> Cc: William Kucharski <william.kucharski(a)oracle.com> Cc: Yang Shi <shy828301(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/open.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- a/fs/open.c~mm-thp-lock-filemap-when-truncating-page-cache +++ a/fs/open.c @@ -856,8 +856,11 @@ static int do_dentry_open(struct file *f * of THPs into the page cache will fail. */ smp_mb(); - if (filemap_nr_thps(inode->i_mapping)) + if (filemap_nr_thps(inode->i_mapping)) { + filemap_invalidate_lock(inode->i_mapping); truncate_pagecache(inode, 0); + filemap_invalidate_unlock(inode->i_mapping); + } } return 0; _

3 years, 11 months

1
0
0 0

[patch 067/262] memcg: prohibit unconditional exceeding the limit of dying tasks

by Andrew Morton

From: Vasily Averin <vvs(a)virtuozzo.com> Subject: memcg: prohibit unconditional exceeding the limit of dying tasks Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It is assumed that the amount of the memory charged by those tasks is bound and most of the memory will get released while the task is exiting. This is resembling a heuristic for the global OOM situation when tasks get access to memory reserves. There is no global memory shortage at the memcg level so the memcg heuristic is more relieved. The above assumption is overly optimistic though. E.g. vmalloc can scale to really large requests and the heuristic would allow that. We used to have an early break in the vmalloc allocator for killed tasks but this has been reverted by commit b8c8a338f75e ("Revert "vmalloc: back off when the current task is killed""). There are likely other similar code paths which do not check for fatal signals in an allocation&charge loop. Also there are some kernel objects charged to a memcg which are not bound to a process life time. It has been observed that it is not really hard to trigger these bypasses and cause global OOM situation. One potential way to address these runaways would be to limit the amount of excess (similar to the global OOM with limited oom reserves). This is certainly possible but it is not really clear how much of an excess is desirable and still protects from global OOMs as that would have to consider the overall memcg configuration. This patch is addressing the problem by removing the heuristic altogether. Bypass is only allowed for requests which either cannot fail or where the failure is not desirable while excess should be still limited (e.g. atomic requests). Implementation wise a killed or dying task fails to charge if it has passed the OOM killer stage. That should give all forms of reclaim chance to restore the limit before the failure (ENOMEM) and tell the caller to back off. In addition, this patch renames should_force_charge() helper to task_is_dying() because now its use is not associated witch forced charging. This patch depends on pagefault_out_of_memory() to not trigger out_of_memory(), because then a memcg failure can unwind to VM_FAULT_OOM and cause a global OOM killer. Link: https://lkml.kernel.org/r/8f5cebbb-06da-4902-91f0-6566fc4b4203@virtuozzo.com Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com> Suggested-by: Michal Hocko <mhocko(a)suse.com> Acked-by: Michal Hocko <mhocko(a)suse.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com> Cc: Roman Gushchin <guro(a)fb.com> Cc: Uladzislau Rezki <urezki(a)gmail.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: Shakeel Butt <shakeelb(a)google.com> Cc: Mel Gorman <mgorman(a)techsingularity.net> Cc: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/memcontrol.c | 27 ++++++++------------------- 1 file changed, 8 insertions(+), 19 deletions(-) --- a/mm/memcontrol.c~memcg-prohibit-unconditional-exceeding-the-limit-of-dying-tasks +++ a/mm/memcontrol.c @@ -234,7 +234,7 @@ enum res_type { iter != NULL; \ iter = mem_cgroup_iter(NULL, iter, NULL)) -static inline bool should_force_charge(void) +static inline bool task_is_dying(void) { return tsk_is_oom_victim(current) || fatal_signal_pending(current) || (current->flags & PF_EXITING); @@ -1624,7 +1624,7 @@ static bool mem_cgroup_out_of_memory(str * A few threads which were not waiting at mutex_lock_killable() can * fail to bail out. Therefore, check again after holding oom_lock. */ - ret = should_force_charge() || out_of_memory(&oc); + ret = task_is_dying() || out_of_memory(&oc); unlock: mutex_unlock(&oom_lock); @@ -2579,6 +2579,7 @@ static int try_charge_memcg(struct mem_c struct page_counter *counter; enum oom_status oom_status; unsigned long nr_reclaimed; + bool passed_oom = false; bool may_swap = true; bool drained = false; unsigned long pflags; @@ -2614,15 +2615,6 @@ retry: goto force; /* - * Unlike in global OOM situations, memcg is not in a physical - * memory shortage. Allow dying and OOM-killed tasks to - * bypass the last charges so that they can exit quickly and - * free their memory. - */ - if (unlikely(should_force_charge())) - goto force; - - /* * Prevent unbounded recursion when reclaim operations need to * allocate memory. This might exceed the limits temporarily, * but we prefer facilitating memory reclaim and getting back @@ -2679,8 +2671,9 @@ retry: if (gfp_mask & __GFP_RETRY_MAYFAIL) goto nomem; - if (fatal_signal_pending(current)) - goto force; + /* Avoid endless loop for tasks bypassed by the oom killer */ + if (passed_oom && task_is_dying()) + goto nomem; /* * keep retrying as long as the memcg oom killer is able to make @@ -2689,14 +2682,10 @@ retry: */ oom_status = mem_cgroup_oom(mem_over_limit, gfp_mask, get_order(nr_pages * PAGE_SIZE)); - switch (oom_status) { - case OOM_SUCCESS: + if (oom_status == OOM_SUCCESS) { + passed_oom = true; nr_retries = MAX_RECLAIM_RETRIES; goto retry; - case OOM_FAILED: - goto force; - default: - goto nomem; } nomem: if (!(gfp_mask & __GFP_NOFAIL)) _

3 years, 11 months

1
0
0 0

[patch 066/262] mm, oom: do not trigger out_of_memory from the #PF

by Andrew Morton

From: Michal Hocko <mhocko(a)suse.com> Subject: mm, oom: do not trigger out_of_memory from the #PF Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory. This can happen for 2 different reasons. a) Memcg is out of memory and we rely on mem_cgroup_oom_synchronize to perform the memcg OOM handling or b) normal allocation fails. The latter is quite problematic because allocation paths already trigger out_of_memory and the page allocator tries really hard to not fail allocations. Anyway, if the OOM killer has been already invoked there is no reason to invoke it again from the #PF path. Especially when the OOM condition might be gone by that time and we have no way to find out other than allocate. Moreover if the allocation failed and the OOM killer hasn't been invoked then we are unlikely to do the right thing from the #PF context because we have already lost the allocation context and restictions and therefore might oom kill a task from a different NUMA domain. This all suggests that there is no legitimate reason to trigger out_of_memory from pagefault_out_of_memory so drop it. Just to be sure that no #PF path returns with VM_FAULT_OOM without allocation print a warning that this is happening before we restart the #PF. [VvS: #PF allocation can hit into limit of cgroup v1 kmem controller. This is a local problem related to memcg, however, it causes unnecessary global OOM kills that are repeated over and over again and escalate into a real disaster. This has been broken since kmem accounting has been introduced for cgroup v1 (3.8). There was no kmem specific reclaim for the separate limit so the only way to handle kmem hard limit was to return with ENOMEM. In upstream the problem will be fixed by removing the outdated kmem limit, however stable and LTS kernels cannot do it and are still affected. This patch fixes the problem and should be backported into stable/LTS.] Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com Signed-off-by: Michal Hocko <mhocko(a)suse.com> Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com> Acked-by: Michal Hocko <mhocko(a)suse.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Mel Gorman <mgorman(a)techsingularity.net> Cc: Roman Gushchin <guro(a)fb.com> Cc: Shakeel Butt <shakeelb(a)google.com> Cc: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp> Cc: Uladzislau Rezki <urezki(a)gmail.com> Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/oom_kill.c | 22 ++++++++-------------- 1 file changed, 8 insertions(+), 14 deletions(-) --- a/mm/oom_kill.c~mm-oom-do-not-trigger-out_of_memory-from-the-pf +++ a/mm/oom_kill.c @@ -1120,19 +1120,15 @@ bool out_of_memory(struct oom_control *o } /* - * The pagefault handler calls here because it is out of memory, so kill a - * memory-hogging task. If oom_lock is held by somebody else, a parallel oom - * killing is already in progress so do nothing. + * The pagefault handler calls here because some allocation has failed. We have + * to take care of the memcg OOM here because this is the only safe context without + * any locks held but let the oom killer triggered from the allocation context care + * about the global OOM. */ void pagefault_out_of_memory(void) { - struct oom_control oc = { - .zonelist = NULL, - .nodemask = NULL, - .memcg = NULL, - .gfp_mask = 0, - .order = 0, - }; + static DEFINE_RATELIMIT_STATE(pfoom_rs, DEFAULT_RATELIMIT_INTERVAL, + DEFAULT_RATELIMIT_BURST); if (mem_cgroup_oom_synchronize(true)) return; @@ -1140,10 +1136,8 @@ void pagefault_out_of_memory(void) if (fatal_signal_pending(current)) return; - if (!mutex_trylock(&oom_lock)) - return; - out_of_memory(&oc); - mutex_unlock(&oom_lock); + if (__ratelimit(&pfoom_rs)) + pr_warn("Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF\n"); } SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags) _

3 years, 11 months

1
0
0 0

[patch 065/262] mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks

by Andrew Morton

From: Vasily Averin <vvs(a)virtuozzo.com> Subject: mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3. Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It can be misused and allowed to trigger global OOM from inside a memcg-limited container. On the other hand if memcg fails allocation, called from inside #PF handler it triggers global OOM from inside pagefault_out_of_memory(). To prevent these problems this patchset: a) removes execution of out_of_memory() from pagefault_out_of_memory(), becasue nobody can explain why it is necessary. b) allow memcg to fail allocation of dying/killed tasks. This patch (of 3): Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory which in turn executes out_out_memory() and can kill a random task. An allocation might fail when the current task is the oom victim and there are no memory reserves left. The OOM killer is already handled at the page allocator level for the global OOM and at the charging level for the memcg one. Both have much more information about the scope of allocation/charge request. This means that either the OOM killer has been invoked properly and didn't lead to the allocation success or it has been skipped because it couldn't have been invoked. In both cases triggering it from here is pointless and even harmful. It makes much more sense to let the killed task die rather than to wake up an eternally hungry oom-killer and send him to choose a fatter victim for breakfast. Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com> Suggested-by: Michal Hocko <mhocko(a)suse.com> Acked-by: Michal Hocko <mhocko(a)suse.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Mel Gorman <mgorman(a)techsingularity.net> Cc: Roman Gushchin <guro(a)fb.com> Cc: Shakeel Butt <shakeelb(a)google.com> Cc: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp> Cc: Uladzislau Rezki <urezki(a)gmail.com> Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/oom_kill.c | 3 +++ 1 file changed, 3 insertions(+) --- a/mm/oom_kill.c~mm-oom-pagefault_out_of_memory-dont-force-global-oom-for-dying-tasks +++ a/mm/oom_kill.c @@ -1137,6 +1137,9 @@ void pagefault_out_of_memory(void) if (mem_cgroup_oom_synchronize(true)) return; + if (fatal_signal_pending(current)) + return; + if (!mutex_trylock(&oom_lock)) return; out_of_memory(&oc); _

3 years, 11 months

1
0
0 0

[patch 048/262] mm/filemap.c: remove bogus VM_BUG_ON

by Andrew Morton

From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org> Subject: mm/filemap.c: remove bogus VM_BUG_ON It is not safe to check page->index without holding the page lock. It can be changed if the page is moved between the swap cache and the page cache for a shmem file, for example. There is a VM_BUG_ON below which checks page->index is correct after taking the page lock. Link: https://lkml.kernel.org/r/20210818144932.940640-1-willy@infradead.org Fixes: 5c211ba29deb ("mm: add and use find_lock_entries") Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org> Reported-by: <syzbot+c87be4f669d920c76330(a)syzkaller.appspotmail.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/filemap.c | 1 - 1 file changed, 1 deletion(-) --- a/mm/filemap.c~mm-remove-bogus-vm_bug_on +++ a/mm/filemap.c @@ -2093,7 +2093,6 @@ unsigned find_lock_entries(struct addres if (!xa_is_value(page)) { if (page->index < start) goto put; - VM_BUG_ON_PAGE(page->index != xas.xa_index, page); if (page->index + thp_nr_pages(page) - 1 > end) goto put; if (!trylock_page(page)) _

3 years, 11 months

1
0
0 0

[patch 007/262] ocfs2: fix data corruption on truncate

by Andrew Morton

From: Jan Kara <jack(a)suse.cz> Subject: ocfs2: fix data corruption on truncate Patch series "ocfs2: Truncate data corruption fix". As further testing has shown, commit 5314454ea3f ("ocfs2: fix data corruption after conversion from inline format") didn't fix all the data corruption issues the customer started observing after 6dbf7bb55598 ("fs: Don't invalidate page buffers in block_write_full_page()") This time I have tracked them down to two bugs in ocfs2 truncation code. One bug (truncating page cache before clearing tail cluster and setting i_size) could cause data corruption even before 6dbf7bb55598, but before that commit it needed a race with page fault, after 6dbf7bb55598 it started to be pretty deterministic. Another bug (zeroing pages beyond old i_size) used to be harmless inefficiency before commit 6dbf7bb55598. But after commit 6dbf7bb55598 in combination with the first bug it resulted in deterministic data corruption. Although fixing only the first problem is needed to stop data corruption, I've fixed both issues to make the code more robust. This patch (of 2): ocfs2_truncate_file() did unmap invalidate page cache pages before zeroing partial tail cluster and setting i_size. Thus some pages could be left (and likely have left if the cluster zeroing happened) in the page cache beyond i_size after truncate finished letting user possibly see stale data once the file was extended again. Also the tail cluster zeroing was not guaranteed to finish before truncate finished causing possible stale data exposure. The problem started to be particularly easy to hit after commit 6dbf7bb55598 "fs: Don't invalidate page buffers in block_write_full_page()" stopped invalidation of pages beyond i_size from page writeback path. Fix these problems by unmapping and invalidating pages in the page cache after the i_size is reduced and tail cluster is zeroed out. Link: https://lkml.kernel.org/r/20211025150008.29002-1-jack@suse.cz Link: https://lkml.kernel.org/r/20211025151332.11301-1-jack@suse.cz Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem") Signed-off-by: Jan Kara <jack(a)suse.cz> Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com> Cc: Mark Fasheh <mark(a)fasheh.com> Cc: Joel Becker <jlbec(a)evilplan.org> Cc: Junxiao Bi <junxiao.bi(a)oracle.com> Cc: Changwei Ge <gechangwei(a)live.cn> Cc: Gang He <ghe(a)suse.com> Cc: Jun Piao <piaojun(a)huawei.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/ocfs2/file.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) --- a/fs/ocfs2/file.c~ocfs2-fix-data-corruption-on-truncate +++ a/fs/ocfs2/file.c @@ -476,10 +476,11 @@ int ocfs2_truncate_file(struct inode *in * greater than page size, so we have to truncate them * anyway. */ - unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1); - truncate_inode_pages(inode->i_mapping, new_i_size); if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) { + unmap_mapping_range(inode->i_mapping, + new_i_size + PAGE_SIZE - 1, 0, 1); + truncate_inode_pages(inode->i_mapping, new_i_size); status = ocfs2_truncate_inline(inode, di_bh, new_i_size, i_size_read(inode), 1); if (status) @@ -498,6 +499,9 @@ int ocfs2_truncate_file(struct inode *in goto bail_unlock_sem; } + unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1); + truncate_inode_pages(inode->i_mapping, new_i_size); + status = ocfs2_commit_truncate(osb, inode, di_bh); if (status < 0) { mlog_errno(status); _

3 years, 11 months

1
0
0 0

[PATCH] staging/fbtft: Fix backlight

by Noralf Trønnes

Commit b4a1ed0cd18b ("fbdev: make FB_BACKLIGHT a tristate") forgot to update fbtft breaking its backlight support when FB_BACKLIGHT is a module. Fix this by using IS_ENABLED(). Fixes: b4a1ed0cd18b ("fbdev: make FB_BACKLIGHT a tristate") Cc: <stable(a)vger.kernel.org> Signed-off-by: Noralf Trønnes <noralf(a)tronnes.org> --- drivers/staging/fbtft/fbtft-core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c index ed992ca605eb..ff979f1eeee8 100644 --- a/drivers/staging/fbtft/fbtft-core.c +++ b/drivers/staging/fbtft/fbtft-core.c @@ -128,7 +128,7 @@ static int fbtft_request_gpios(struct fbtft_par *par) return 0; } -#ifdef CONFIG_FB_BACKLIGHT +#if IS_ENABLED(CONFIG_FB_BACKLIGHT) static int fbtft_backlight_update_status(struct backlight_device *bd) { struct fbtft_par *par = bl_get_data(bd); @@ -853,7 +853,7 @@ int fbtft_register_framebuffer(struct fb_info *fb_info) fb_info->fix.smem_len >> 10, text1, HZ / fb_info->fbdefio->delay, text2); -#ifdef CONFIG_FB_BACKLIGHT +#if IS_ENABLED(CONFIG_FB_BACKLIGHT) /* Turn on backlight if available */ if (fb_info->bl_dev) { fb_info->bl_dev->props.power = FB_BLANK_UNBLANK; -- 2.33.0

3 years, 11 months

1
1
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror November 2021