June 2020 - Linux-stable-mirror

[merged] mm-memcontrol-fix-do-not-put-the-css-reference.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: mm/memcontrol.c: add missed css_put()
has been removed from the -mm tree.  Its filename was
     mm-memcontrol-fix-do-not-put-the-css-reference.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm/memcontrol.c: add missed css_put()

We should put the css reference when memory allocation failed.

Link: http://lkml.kernel.org/r/20200614122653.98829-1-songmuchun@bytedance.com
Fixes: f0a3a24b532d ("mm: memcg/slab: rework non-root kmem_cache lifecycle management")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/memcontrol.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/memcontrol.c~mm-memcontrol-fix-do-not-put-the-css-reference
+++ a/mm/memcontrol.c
@@ -2772,8 +2772,10 @@ static void memcg_schedule_kmem_cache_cr
 		return;

 	cw = kmalloc(sizeof(*cw), GFP_NOWAIT | __GFP_NOWARN);
-	if (!cw)
+	if (!cw) {
+		css_put(&memcg->css);
 		return;
+	}

 	cw->memcg = memcg;
 	cw->cachep = cachep;
_

Patches currently in -mm which might be from songmuchun(a)bytedance.com are
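For readers less familiar with the pattern, the leak fixed above can be modelled in plain userspace C. This is only a sketch: `toy_ref`, `toy_work`, `toy_schedule()` and the `simulate_oom` switch are hypothetical stand-ins, not kernel types or functions. It assumes, as the hunk context suggests, that the reference is taken shortly before the allocation and would otherwise be dropped by whoever completes the work item.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference counter standing in for a css; not the kernel type. */
struct toy_ref {
	int count;
};

static void toy_get(struct toy_ref *ref)
{
	ref->count++;
}

static void toy_put(struct toy_ref *ref)
{
	assert(ref->count > 0);
	ref->count--;
}

/* Work item that owns one reference until it is completed. */
struct toy_work {
	struct toy_ref *owner;
};

/*
 * Take a reference and hand it to a newly allocated work item.
 * If the allocation fails, nobody else will ever drop that reference,
 * so it must be dropped here before returning.
 * @simulate_oom forces the failure path for demonstration.
 */
static struct toy_work *toy_schedule(struct toy_ref *ref, int simulate_oom)
{
	struct toy_work *work;

	toy_get(ref);

	work = simulate_oom ? NULL : malloc(sizeof(*work));
	if (!work) {
		toy_put(ref);	/* the step the patch adds */
		return NULL;
	}

	work->owner = ref;	/* the work item now owns the reference */
	return work;
}

/* Completion side: drop the reference held by the work item, then free it. */
static void toy_complete(struct toy_work *work)
{
	toy_put(work->owner);
	free(work);
}
```

With the `toy_put()` on the failure path removed, every failed allocation would leave the count one higher than it should be, which in the kernel case keeps the css alive indefinitely.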


[merged] mm-memcontrol-handle-div0-crash-race-condition-in-memorylow.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: mm: memcontrol: handle div0 crash race condition in memory.low
has been removed from the -mm tree.  Its filename was
     mm-memcontrol-handle-div0-crash-race-condition-in-memorylow.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: memcontrol: handle div0 crash race condition in memory.low

Tejun reports seeing rare div0 crashes in memory.low stress testing:

[37228.504582] RIP: 0010:mem_cgroup_calculate_protection+0xed/0x150
[37228.505059] Code: 0f 46 d1 4c 39 d8 72 57 f6 05 16 d6 42 01 40 74 1f 4c 39 d8 76 1a 4c 39 d1 76 15 4c 29 d1 4c 29 d8 4d 29 d9 31 d2 48 0f af c1 <49> f7 f1 49 01 c2 4c 89 96 38 01 00 00 5d c3 48 0f af c7 31 d2 49
[37228.506254] RSP: 0018:ffffa14e01d6fcd0 EFLAGS: 00010246
[37228.506769] RAX: 000000000243e384 RBX: 0000000000000000 RCX: 0000000000008f4b
[37228.507319] RDX: 0000000000000000 RSI: ffff8b89bee84000 RDI: 0000000000000000
[37228.507869] RBP: ffffa14e01d6fcd0 R08: ffff8b89ca7d40f8 R09: 0000000000000000
[37228.508376] R10: 0000000000000000 R11: 00000000006422f7 R12: 0000000000000000
[37228.508881] R13: ffff8b89d9617000 R14: ffff8b89bee84000 R15: ffffa14e01d6fdb8
[37228.509397] FS:  0000000000000000(0000) GS:ffff8b8a1f1c0000(0000) knlGS:0000000000000000
[37228.509917] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[37228.510442] CR2: 00007f93b1fc175b CR3: 000000016100a000 CR4: 0000000000340ea0
[37228.511076] Call Trace:
[37228.511561]  shrink_node+0x1e5/0x6c0
[37228.512044]  balance_pgdat+0x32d/0x5f0
[37228.512521]  kswapd+0x1d7/0x3d0
[37228.513346]  ? wait_woken+0x80/0x80
[37228.514170]  kthread+0x11c/0x160
[37228.514983]  ? balance_pgdat+0x5f0/0x5f0
[37228.515797]  ? kthread_park+0x90/0x90
[37228.516593]  ret_from_fork+0x1f/0x30

This happens when parent_usage == siblings_protected.  We check that usage
is bigger than protected, which should imply parent_usage being bigger
than siblings_protected.  However, we don't read (or even update) these
values atomically, and they can be out of sync as the memory state changes
under us.  A bit of fluctuation around the target protection isn't a big
deal, but we need to handle the div0 case.

Check the parent state explicitly to make sure we have a reasonable
positive value for the divisor.

Link: http://lkml.kernel.org/r/20200615140658.601684-1-hannes@cmpxchg.org
Fixes: 8a931f801340 ("mm: memcontrol: recursive memory.low protection")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Chris Down <chris(a)chrisdown.name>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/memcontrol.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/mm/memcontrol.c~mm-memcontrol-handle-div0-crash-race-condition-in-memorylow
+++ a/mm/memcontrol.c
@@ -6360,11 +6360,16 @@ static unsigned long effective_protectio
 	 * We're using unprotected memory for the weight so that if
 	 * some cgroups DO claim explicit protection, we don't protect
 	 * the same bytes twice.
+	 *
+	 * Check both usage and parent_usage against the respective
+	 * protected values. One should imply the other, but they
+	 * aren't read atomically - make sure the division is sane.
 	 */
 	if (!(cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_RECURSIVE_PROT))
 		return ep;
-
-	if (parent_effective > siblings_protected && usage > protected) {
+	if (parent_effective > siblings_protected &&
+	    parent_usage > siblings_protected &&
+	    usage > protected) {
 		unsigned long unclaimed;

 		unclaimed = parent_effective - siblings_protected;
_

Patches currently in -mm which might be from hannes(a)cmpxchg.org are

mm-memcontrol-decouple-reference-counting-from-page-accounting.patch
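The guard added above can be illustrated with a small standalone function. This is a simplified model, not the kernel's effective_protection(): the function name `toy_unclaimed_share()` is hypothetical, and the weighting formula is an assumption inferred from the variable names visible in the hunk. The point is only that each subtraction feeding the divisor is checked on its own, because the inputs are sampled independently.

```c
#include <assert.h>

/*
 * Distribute the parent's unclaimed protection in proportion to a
 * child's unprotected usage. Returns 0 when any precondition fails.
 */
static unsigned long toy_unclaimed_share(unsigned long usage,
					 unsigned long protected_,
					 unsigned long parent_usage,
					 unsigned long parent_effective,
					 unsigned long siblings_protected)
{
	unsigned long unclaimed;

	if (parent_effective <= siblings_protected)
		return 0;
	if (usage <= protected_)
		return 0;
	/*
	 * The check the patch adds: usage > protected_ *should* imply
	 * this, but the values are not read atomically, so test the
	 * divisor explicitly.
	 */
	if (parent_usage <= siblings_protected)
		return 0;

	assert(parent_usage - siblings_protected > 0);

	unclaimed = parent_effective - siblings_protected;
	unclaimed *= usage - protected_;
	unclaimed /= parent_usage - siblings_protected;
	return unclaimed;
}
```

Calling it with `parent_usage == siblings_protected`, the racy state from the report, returns 0 instead of dividing by zero.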


[merged] mm-fix-swap-cache-node-allocation-mask.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: mm: fix swap cache node allocation mask
has been removed from the -mm tree.  Its filename was
     mm-fix-swap-cache-node-allocation-mask.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm: fix swap cache node allocation mask

https://bugzilla.kernel.org/show_bug.cgi?id=208085 reports that a slightly
overcommitted load, testing swap and zram along with i915, splats and
keeps on splatting, when it had better fail less noisily:

gnome-shell: page allocation failure: order:0, mode:0x400d0(__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_RECLAIMABLE), nodemask=(null),cpuset=/,mems_allowed=0
CPU: 2 PID: 1155 Comm: gnome-shell Not tainted 5.7.0-1.fc33.x86_64 #1
Call Trace:
  dump_stack+0x64/0x88
  warn_alloc.cold+0x75/0xd9
  __alloc_pages_slowpath.constprop.0+0xcfa/0xd30
  __alloc_pages_nodemask+0x2df/0x320
  alloc_slab_page+0x195/0x310
  allocate_slab+0x3c5/0x440
  ___slab_alloc+0x40c/0x5f0
  __slab_alloc+0x1c/0x30
  kmem_cache_alloc+0x20e/0x220
  xas_nomem+0x28/0x70
  add_to_swap_cache+0x321/0x400
  __read_swap_cache_async+0x105/0x240
  swap_cluster_readahead+0x22c/0x2e0
  shmem_swapin+0x8e/0xc0
  shmem_swapin_page+0x196/0x740
  shmem_getpage_gfp+0x3a2/0xa60
  shmem_read_mapping_page_gfp+0x32/0x60
  shmem_get_pages+0x155/0x5e0 [i915]
  __i915_gem_object_get_pages+0x68/0xa0 [i915]
  i915_vma_pin+0x3fe/0x6c0 [i915]
  eb_add_vma+0x10b/0x2c0 [i915]
  i915_gem_do_execbuffer+0x704/0x3430 [i915]
  i915_gem_execbuffer2_ioctl+0x1ea/0x3e0 [i915]
  drm_ioctl_kernel+0x86/0xd0 [drm]
  drm_ioctl+0x206/0x390 [drm]
  ksys_ioctl+0x82/0xc0
  __x64_sys_ioctl+0x16/0x20
  do_syscall_64+0x5b/0xf0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported on 5.7, but it goes back really to 3.1: when
shmem_read_mapping_page_gfp() was implemented for use by i915, and allowed
for __GFP_NORETRY and __GFP_NOWARN flags in most places, but missed
swapin's "& GFP_KERNEL" mask for page tree node allocation in
__read_swap_cache_async() - that was to mask off HIGHUSER_MOVABLE bits
from what page cache uses, but GFP_RECLAIM_MASK is now what's needed.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2006151330070.11064@eggly.anvils
Fixes: 68da9f055755 ("tmpfs: pass gfp to shmem_getpage_gfp")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reported-by: Chris Murphy <lists(a)colorremedies.com>
Analyzed-by: Vlastimil Babka <vbabka(a)suse.cz>
Analyzed-by: Matthew Wilcox <willy(a)infradead.org>
Tested-by: Chris Murphy <lists(a)colorremedies.com>
Cc: <stable(a)vger.kernel.org>	[3.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/swap_state.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/swap_state.c~mm-fix-swap-cache-node-allocation-mask
+++ a/mm/swap_state.c
@@ -21,7 +21,7 @@
 #include <linux/vmalloc.h>
 #include <linux/swap_slots.h>
 #include <linux/huge_mm.h>
-
+#include "internal.h"

 /*
  * swapper_space is a fiction, retained to simplify the path through
@@ -429,7 +429,7 @@ struct page *__read_swap_cache_async(swp
 	__SetPageSwapBacked(page);

 	/* May fail (-ENOMEM) if XArray node allocation failed. */
-	if (add_to_swap_cache(page, entry, gfp_mask & GFP_KERNEL)) {
+	if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK)) {
 		put_swap_page(page, entry);
 		goto fail_unlock;
 	}
_

Patches currently in -mm which might be from hughd(a)google.com are

mm-vmstat-add-events-for-pmd-based-thp-migration-without-split-fix.patch
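The effect of the two masks can be sketched with toy flag bits. Everything named `TOY_*` below is hypothetical: the bit values do not match the kernel's, and the toy reclaim mask contains only the flags mentioned in the description above, whereas the real GFP_RECLAIM_MASK covers more. The sketch only shows why masking with the narrower value silently drops a caller's "fail quietly" request.

```c
#include <assert.h>

/* Toy flag bits; illustrative values only. */
#define TOY_IO		0x01u
#define TOY_FS		0x02u
#define TOY_RECLAIM	0x04u
#define TOY_NORETRY	0x08u
#define TOY_NOWARN	0x10u
#define TOY_HIGHMEM	0x20u
#define TOY_MOVABLE	0x40u

#define TOY_KERNEL		(TOY_RECLAIM | TOY_IO | TOY_FS)
#define TOY_RECLAIM_MASK	(TOY_KERNEL | TOY_NORETRY | TOY_NOWARN)

/* Old behaviour: placement bits are dropped, but so are the modifiers. */
static unsigned int toy_node_gfp_old(unsigned int gfp_mask)
{
	return gfp_mask & TOY_KERNEL;
}

/* New behaviour: placement bits are dropped, reclaim modifiers survive. */
static unsigned int toy_node_gfp_new(unsigned int gfp_mask)
{
	unsigned int mask = gfp_mask & TOY_RECLAIM_MASK;

	/* Placement bits must never reach the node allocation. */
	assert((mask & (TOY_HIGHMEM | TOY_MOVABLE)) == 0);
	return mask;
}
```

A caller passing the toy equivalent of "highuser movable, no retry, no warn" keeps `TOY_NOWARN` only through `toy_node_gfp_new()`, which corresponds to the quieter failure the report asks for.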


[merged] mm-slab-use-memzero_explicit-in-kzfree.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: mm/slab: use memzero_explicit() in kzfree()
has been removed from the -mm tree.  Its filename was
     mm-slab-use-memzero_explicit-in-kzfree.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Waiman Long <longman(a)redhat.com>
Subject: mm/slab: use memzero_explicit() in kzfree()

The kzfree() function is normally used to clear some sensitive
information, like encryption keys, in the buffer before freeing it back to
the pool.  Memset() is currently used for buffer clearing.  However
unlikely, there is still a non-zero probability that the compiler may
choose to optimize away the memory clearing especially if LTO is being
used in the future.

To make sure that this optimization will never happen,
memzero_explicit(), which is introduced in v3.18, is now used in
kzfree() to future-proof it.

Link: http://lkml.kernel.org/r/20200616154311.12314-2-longman@redhat.com
Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()")
Signed-off-by: Waiman Long <longman(a)redhat.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
Cc: James Morris <jmorris(a)namei.org>
Cc: "Serge E. Hallyn" <serge(a)hallyn.com>
Cc: Joe Perches <joe(a)perches.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: "Jason A . Donenfeld" <Jason(a)zx2c4.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/slab_common.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/slab_common.c~mm-slab-use-memzero_explicit-in-kzfree
+++ a/mm/slab_common.c
@@ -1726,7 +1726,7 @@ void kzfree(const void *p)
 	if (unlikely(ZERO_OR_NULL_PTR(mem)))
 		return;
 	ks = ksize(mem);
-	memset(mem, 0, ks);
+	memzero_explicit(mem, ks);
 	kfree(mem);
 }
 EXPORT_SYMBOL(kzfree);
_

Patches currently in -mm which might be from longman(a)redhat.com are

mm-treewide-rename-kzfree-to-kfree_sensitive.patch
sched-mm-optimize-current_gfp_context.patch
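A userspace analogue may help show what "cannot be optimized away" means here. The sketch below is not the kernel's memzero_explicit(); `toy_memzero_explicit()` and `toy_free_sensitive()` are hypothetical names, and the technique used (writing through a volatile-qualified pointer so each store counts as an observable access) is one common way to get a similar guarantee outside the kernel. The caller is assumed to know the buffer length, since userspace has no ksize().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Zero a buffer through a volatile pointer. The compiler has to emit
 * every store, even when the buffer is freed or goes out of scope
 * immediately afterwards.
 */
static void toy_memzero_explicit(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len--)
		*p++ = 0;
}

/* Wipe, then free: a userspace counterpart of the kzfree() pattern. */
static void toy_free_sensitive(void *buf, size_t len)
{
	if (!buf)
		return;
	toy_memzero_explicit(buf, len);
	free(buf);
}
```

A plain memset() followed directly by free() is the case where dead-store elimination is a theoretical risk; the volatile stores close that gap at the cost of a byte-wise loop.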


[merged] mm-slab-fix-sign-conversion-problem-in-memcg_uncharge_slab.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: mm, slab: fix sign conversion problem in memcg_uncharge_slab()
has been removed from the -mm tree.  Its filename was
     mm-slab-fix-sign-conversion-problem-in-memcg_uncharge_slab.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Waiman Long <longman(a)redhat.com>
Subject: mm, slab: fix sign conversion problem in memcg_uncharge_slab()

It was found that running the LTP test on a PowerPC system could produce
erroneous values in /proc/meminfo, like:

  MemTotal:       531915072 kB
  MemFree:        507962176 kB
  MemAvailable:   1100020596352 kB

Using bisection, the problem is tracked down to commit 9c315e4d7d8c ("mm:
memcg/slab: cache page number in memcg_(un)charge_slab()").

In memcg_uncharge_slab() with a "int order" argument:

  unsigned int nr_pages = 1 << order;
    :
  mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);

The mod_lruvec_state() function will eventually call the
__mod_zone_page_state() which accepts a long argument.  Depending on the
compiler and how inlining is done, "-nr_pages" may be treated as a
negative number or a very large positive number.  Apparently, it was
treated as a large positive number in that PowerPC system leading to
incorrect stat counts.  This problem hasn't been seen in x86-64 yet,
perhaps the gcc compiler there has some slight difference in behavior.

It is fixed by making nr_pages a signed value.  For consistency, a similar
change is applied to memcg_charge_slab() as well.

Link: http://lkml.kernel.org/r/20200620184719.10994-1-longman@redhat.com
Fixes: 9c315e4d7d8c ("mm: memcg/slab: cache page number in memcg_(un)charge_slab()").
Signed-off-by: Waiman Long <longman(a)redhat.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/slab.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/slab.h~mm-slab-fix-sign-conversion-problem-in-memcg_uncharge_slab
+++ a/mm/slab.h
@@ -348,7 +348,7 @@ static __always_inline int memcg_charge_
 					     gfp_t gfp, int order,
 					     struct kmem_cache *s)
 {
-	unsigned int nr_pages = 1 << order;
+	int nr_pages = 1 << order;
 	struct mem_cgroup *memcg;
 	struct lruvec *lruvec;
 	int ret;
@@ -388,7 +388,7 @@ out:
 static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 						struct kmem_cache *s)
 {
-	unsigned int nr_pages = 1 << order;
+	int nr_pages = 1 << order;
 	struct mem_cgroup *memcg;
 	struct lruvec *lruvec;

_

Patches currently in -mm which might be from longman(a)redhat.com are

mm-treewide-rename-kzfree-to-kfree_sensitive.patch
sched-mm-optimize-current_gfp_context.patch
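The C conversion rule behind this report can be reproduced in isolation. The sketch below does not mirror the kernel call chain, which goes through several inlined helpers; `toy_mod_state()` and the two `toy_uncharge_*()` functions are hypothetical, and fixed-width types are used so the result does not depend on the platform's `long`. It assumes a 32-bit `int`, so that negating a `uint32_t` stays unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a statistics counter updated with a wide signed delta. */
static int64_t toy_counter;

static void toy_mod_state(int64_t delta)
{
	toy_counter += delta;
}

/* Buggy shape: negating an unsigned value wraps before it is widened. */
static void toy_uncharge_unsigned(int order)
{
	uint32_t nr_pages;

	assert(order >= 0 && order < 31);
	nr_pages = UINT32_C(1) << order;
	toy_mod_state(-nr_pages);	/* 2^32 - nr_pages, not -nr_pages */
}

/* Fixed shape: a signed count negates to a genuinely negative delta. */
static void toy_uncharge_signed(int order)
{
	int32_t nr_pages;

	assert(order >= 0 && order < 31);
	nr_pages = INT32_C(1) << order;
	toy_mod_state(-nr_pages);
}
```

With `order == 2`, the unsigned variant adds 4294967292 to the counter while the signed variant subtracts 4. Whether the kernel code hit the wide conversion directly depended on inlining, as the description notes, so this shows the mechanism rather than the exact path.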


[merged] ocfs2-fix-value-of-ocfs2_invalid_slot.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: ocfs2: fix value of OCFS2_INVALID_SLOT
has been removed from the -mm tree.  Its filename was
     ocfs2-fix-value-of-ocfs2_invalid_slot.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: fix value of OCFS2_INVALID_SLOT

In the ocfs2 disk layout, slot number is 16 bits, but in ocfs2
implementation, slot number is 32 bits.  Usually this will not cause any
issue, because slot number is converted from u16 to u32, but
OCFS2_INVALID_SLOT was defined as -1.  When an invalid slot number was
obtained from disk, its value was (u16)-1, and it was then converted to
u32.  As a result, the following check in get_local_system_inode() was
always skipped:

static struct inode **get_local_system_inode(struct ocfs2_super *osb,
					       int type,
					       u32 slot)
{
	BUG_ON(slot == OCFS2_INVALID_SLOT);
	...
}

Link: http://lkml.kernel.org/r/20200616183829.87211-5-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/ocfs2/ocfs2_fs.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ocfs2/ocfs2_fs.h~ocfs2-fix-value-of-ocfs2_invalid_slot
+++ a/fs/ocfs2/ocfs2_fs.h
@@ -290,7 +290,7 @@
 #define OCFS2_MAX_SLOTS			255

 /* Slot map indicator for an empty slot */
-#define OCFS2_INVALID_SLOT		-1
+#define OCFS2_INVALID_SLOT		((u16)-1)

 #define OCFS2_VOL_UUID_LEN		16
 #define OCFS2_MAX_VOL_LABEL_LEN		64
_

Patches currently in -mm which might be from junxiao.bi(a)oracle.com are
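The comparison that went wrong can be reproduced with fixed-width userspace types. In this sketch `uint16_t` and `uint32_t` stand in for the kernel's u16 and u32, and all `toy_*` / `TOY_*` names are hypothetical. The explicit `(uint32_t)` cast in the old-style check only spells out the conversion that the compiler applies implicitly when a u32 is compared with -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INVALID_SLOT_OLD	-1
#define TOY_INVALID_SLOT_NEW	((uint16_t)-1)

/* The on-disk field is 16 bits; the in-memory slot number is 32 bits. */
static uint32_t toy_slot_from_disk(uint16_t on_disk)
{
	return on_disk;		/* 0xFFFF widens to 0x0000FFFF */
}

/* Old check: -1 converts to 0xFFFFFFFF, which never equals 0x0000FFFF. */
static bool toy_is_invalid_old(uint32_t slot)
{
	return slot == (uint32_t)TOY_INVALID_SLOT_OLD;
}

/* New check: (uint16_t)-1 is 65535, matching the widened on-disk value. */
static bool toy_is_invalid_new(uint32_t slot)
{
	return slot == TOY_INVALID_SLOT_NEW;
}

/* Userspace analogue of the BUG_ON() quoted in the commit message. */
static void toy_use_slot(uint32_t slot)
{
	assert(!toy_is_invalid_new(slot));
}
```

Reading the sentinel 0xFFFF from disk gives a slot that the old check accepts as valid and the new check rejects, which is the behaviour change the one-line patch is after.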


[merged] ocfs2-fix-panic-on-nfs-server-over-ocfs2.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: ocfs2: fix panic on nfs server over ocfs2
has been removed from the -mm tree.  Its filename was
     ocfs2-fix-panic-on-nfs-server-over-ocfs2.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: fix panic on nfs server over ocfs2

The following kernel panic was captured when running nfs server over
ocfs2, at that time ocfs2_test_inode_bit() was checking whether one inode
locating at "blkno" 5 was valid, that is ocfs2 root inode, its
"suballoc_slot" was OCFS2_INVALID_SLOT(65535) and it was allocated from
//global_inode_alloc, but here it wrongly assumed that it was got from per
slot inode allocator which would cause array overflow and trigger kernel
panic.

[430033.469151] BUG: unable to handle kernel paging request at 0000000000001088
[430033.469367] IP: [<ffffffff816f6898>] _raw_spin_lock+0x18/0xf0
[430033.469567] PGD 1e06ba067 PUD 1e9e7d067 PMD 0
[430033.469769] Oops: 0002 [#1] SMP
[430033.469975] Modules linked in: tun nfsd lockd grace nfs_acl auth_rpcgss ocfs2 xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc fcoe libfcoe libfc sunrpc bridge 8021q mrp garp stp llc bonding dm_round_robin scsi_dh_emc dm_multipath iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 i2c_core lpc_ich mfd_core sg ext4 jbd2 mbcache2 sd_mod ahci libahci lpfc scsi_transport_fc be2net vxlan udp_tunnel ip6_udp_tunnel mpt3sas scsi_transport_sas raid_class crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod
[430033.472350] CPU: 6 PID: 24873 Comm: nfsd Not tainted 4.1.12-124.36.1.el6uek.x86_64 #2
[430033.472719] Hardware name: Huawei CH121 V3/IT11SGCA1, BIOS 3.87 02/02/2018
[430033.472910] task: ffff88005ae98000 ti: ffff88005ae94000 task.ti: ffff88005ae94000
[430033.473277] RIP: e030:[<ffffffff816f6898>]  [<ffffffff816f6898>] _raw_spin_lock+0x18/0xf0
[430033.473655] RSP: e02b:ffff88005ae97908  EFLAGS: 00010206
[430033.473850] RAX: ffff88005ae98000 RBX: 0000000000001088 RCX: 0000000000000000
[430033.474205] RDX: 0000000000020000 RSI: 0000000000000009 RDI: 0000000000001088
[430033.474574] RBP: ffff88005ae97928 R08: 0000000000000000 R09: ffff880212878e00
[430033.474938] R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000001088
[430033.475324] R13: ffff8800063c0aa8 R14: ffff8800650c27d0 R15: 000000000000ffff
[430033.475721] FS:  0000000000000000(0000) GS:ffff880218180000(0000) knlGS:ffff880218180000
[430033.476199] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[430033.476390] CR2: 0000000000001088 CR3: 00000002033d0000 CR4: 0000000000042660
[430033.476760] Stack:
[430033.476942]  0000000000001000 0000000000001088 ffff8800063c0aa8 ffff8800650c27d0
[430033.477329]  ffff88005ae97948 ffffffff8122a3de 0000000000000009 ffff8800063c0000
[430033.477718]  ffff88005ae979e8 ffffffffc0714e43 ffff88005ae97968 ffff88019de8f958
[430033.478104] Call Trace:
[430033.478286]  [<ffffffff8122a3de>] igrab+0x1e/0x60
[430033.478494]  [<ffffffffc0714e43>] ocfs2_get_system_file_inode+0x63/0x3a0 [ocfs2]
[430033.478870]  [<ffffffffc06a87df>] ? ocfs2_read_blocks_sync+0x13f/0x3c0 [ocfs2]
[430033.479267]  [<ffffffffc06ff2d8>] ocfs2_test_inode_bit+0x328/0xa00 [ocfs2]
[430033.479498]  [<ffffffffc06bef5a>] ocfs2_get_parent+0xba/0x3e0 [ocfs2]
[430033.479730]  [<ffffffff8129b305>] reconnect_path+0xb5/0x300
[430033.479933]  [<ffffffff8129b646>] exportfs_decode_fh+0xf6/0x2b0
[430033.480124]  [<ffffffffc0814af0>] ? nfsd_proc_getattr+0xa0/0xa0 [nfsd]
[430033.480294]  [<ffffffffc081a682>] ? exp_find+0xe2/0x190 [nfsd]
[430033.480461]  [<ffffffff810e5a7e>] ? irq_get_irq_data+0xe/0x10
[430033.480627]  [<ffffffff810ea1a7>] ? __call_rcu_nocb_enqueue+0xd7/0xe0
[430033.480794]  [<ffffffff810eb9e8>] ? __call_rcu+0xe8/0x360
[430033.480959]  [<ffffffffc0815860>] fh_verify+0x350/0x660 [nfsd]
[430033.481134]  [<ffffffffc0535076>] ? cache_check+0x56/0x3a0 [sunrpc]
[430033.481317]  [<ffffffffc0823a4d>] nfsd4_putfh+0x4d/0x60 [nfsd]
[430033.481505]  [<ffffffffc0826003>] nfsd4_proc_compound+0x3d3/0x6f0 [nfsd]
[430033.481730]  [<ffffffffc0811f60>] nfsd_dispatch+0xe0/0x290 [nfsd]
[430033.481950]  [<ffffffffc052b752>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
[430033.482152]  [<ffffffffc052a512>] svc_process_common+0x412/0x6a0 [sunrpc]
[430033.482351]  [<ffffffffc052a8c3>] svc_process+0x123/0x210 [sunrpc]
[430033.482550]  [<ffffffffc081190f>] nfsd+0xff/0x170 [nfsd]
[430033.482744]  [<ffffffffc0811810>] ? nfsd_destroy+0x80/0x80 [nfsd]
[430033.482943]  [<ffffffff810a7aeb>] kthread+0xcb/0xf0
[430033.483151]  [<ffffffff816f10ea>] ? __schedule+0x24a/0x810
[430033.483354]  [<ffffffff816f10ea>] ? __schedule+0x24a/0x810
[430033.483553]  [<ffffffff810a7a20>] ? kthread_create_on_node+0x180/0x180
[430033.483777]  [<ffffffff816f72a1>] ret_from_fork+0x61/0x90
[430033.483976]  [<ffffffff810a7a20>] ? kthread_create_on_node+0x180/0x180
[430033.484191] Code: 83 c2 02 0f b7 f2 e8 18 dc 91 ff 66 90 eb bf 0f 1f 40 00 55 48 89 e5 41 56 41 55 41 54 53 0f 1f 44 00 00 48 89 fb ba 00 00 02 00 <f0> 0f c1 17 89 d0 45 31 e4 45 31 ed c1 e8 10 66 39 d0 41 89 c6
[430033.485174] RIP  [<ffffffff816f6898>] _raw_spin_lock+0x18/0xf0
[430033.485370]  RSP <ffff88005ae97908>
[430033.485566] CR2: 0000000000001088
[430033.486223] ---[ end trace 7264463cd1aac8f9 ]---
[430033.666368] Kernel panic - not syncing: Fatal exception

Link: http://lkml.kernel.org/r/20200616183829.87211-4-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/ocfs2/suballoc.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--- a/fs/ocfs2/suballoc.c~ocfs2-fix-panic-on-nfs-server-over-ocfs2
+++ a/fs/ocfs2/suballoc.c
@@ -2825,9 +2825,12 @@ int ocfs2_test_inode_bit(struct ocfs2_su
 		goto bail;
 	}

-	inode_alloc_inode =
-		ocfs2_get_system_file_inode(osb, INODE_ALLOC_SYSTEM_INODE,
-					    suballoc_slot);
+	if (suballoc_slot == (u16)OCFS2_INVALID_SLOT)
+		inode_alloc_inode = ocfs2_get_system_file_inode(osb,
+			GLOBAL_INODE_ALLOC_SYSTEM_INODE, suballoc_slot);
+	else
+		inode_alloc_inode = ocfs2_get_system_file_inode(osb,
+			INODE_ALLOC_SYSTEM_INODE, suballoc_slot);
 	if (!inode_alloc_inode) {
 		/* the error code could be inaccurate, but we are not able to
 		 * get the correct one. */
_

Patches currently in -mm which might be from junxiao.bi(a)oracle.com are
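The selection logic in the hunk above amounts to "check the sentinel before using the slot as an index". A minimal userspace sketch of that idea follows. The table size, the `toy_*` names and the extra upper-bound check are all assumptions made for illustration; the patch itself only adds the sentinel test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SLOTS		4
#define TOY_INVALID_SLOT	((uint16_t)-1)

struct toy_alloc {
	const char *name;
};

static struct toy_alloc toy_global_alloc = { "global_inode_alloc" };
static struct toy_alloc toy_slot_alloc[TOY_MAX_SLOTS] = {
	{ "inode_alloc:0000" }, { "inode_alloc:0001" },
	{ "inode_alloc:0002" }, { "inode_alloc:0003" },
};

/*
 * Pick the allocator an inode came from. The sentinel means "global
 * allocator" and must be tested before the slot indexes the per-slot
 * table; otherwise slot 65535 reads far past the end of the array.
 */
static struct toy_alloc *toy_pick_alloc(uint16_t suballoc_slot)
{
	if (suballoc_slot == TOY_INVALID_SLOT)
		return &toy_global_alloc;
	if (suballoc_slot >= TOY_MAX_SLOTS)
		return NULL;	/* defensive check, not part of the patch */

	assert(suballoc_slot < TOY_MAX_SLOTS);
	return &toy_slot_alloc[suballoc_slot];
}
```

In the oops above, R15 holds 0xffff, consistent with the sentinel having been used as a slot number on the way into ocfs2_get_system_file_inode().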


[merged] ocfs2-load-global_inode_alloc.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: ocfs2: load global_inode_alloc
has been removed from the -mm tree.  Its filename was
     ocfs2-load-global_inode_alloc.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: load global_inode_alloc

Set global_inode_alloc as OCFS2_FIRST_ONLINE_SYSTEM_INODE, that will make
it load during mount.  It can be used to test whether some global/system
inodes are valid.  One use case is that nfsd will test whether root inode
is valid.

Link: http://lkml.kernel.org/r/20200616183829.87211-3-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/ocfs2/ocfs2_fs.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ocfs2/ocfs2_fs.h~ocfs2-load-global_inode_alloc
+++ a/fs/ocfs2/ocfs2_fs.h
@@ -326,8 +326,8 @@ struct ocfs2_system_inode_info {
 enum {
 	BAD_BLOCK_SYSTEM_INODE = 0,
 	GLOBAL_INODE_ALLOC_SYSTEM_INODE,
+#define OCFS2_FIRST_ONLINE_SYSTEM_INODE GLOBAL_INODE_ALLOC_SYSTEM_INODE
 	SLOT_MAP_SYSTEM_INODE,
-#define OCFS2_FIRST_ONLINE_SYSTEM_INODE SLOT_MAP_SYSTEM_INODE
 	HEARTBEAT_SYSTEM_INODE,
 	GLOBAL_BITMAP_SYSTEM_INODE,
 	USER_QUOTA_SYSTEM_INODE,
_

Patches currently in -mm which might be from junxiao.bi(a)oracle.com are
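Moving a "first online" marker inside an enum changes which entries a range loop visits. The sketch below assumes that mount-time code walks the enum from the marker to the last global entry, which the description implies but this mail does not show; the shortened enum and every `toy_*` / `TOY_*` name are hypothetical.

```c
#include <assert.h>

enum toy_sys_inode {
	TOY_BAD_BLOCK = 0,
	TOY_GLOBAL_INODE_ALLOC,
	TOY_SLOT_MAP,
	TOY_HEARTBEAT,
	TOY_GLOBAL_BITMAP,
	TOY_NUM_SYS_INODES
};

#define TOY_FIRST_ONLINE_OLD	TOY_SLOT_MAP
#define TOY_FIRST_ONLINE_NEW	TOY_GLOBAL_INODE_ALLOC

/*
 * Mark every system inode from @first_online to the end as loaded.
 * Returns how many entries were loaded.
 */
static int toy_load_online(int first_online, int loaded[TOY_NUM_SYS_INODES])
{
	int i, count = 0;

	assert(first_online >= 0 && first_online < TOY_NUM_SYS_INODES);
	for (i = first_online; i < TOY_NUM_SYS_INODES; i++) {
		loaded[i] = 1;
		count++;
	}
	return count;
}
```

With the old marker the global inode allocator is skipped; with the new marker it is loaded along with the rest, so later lookups of it during nfsd handle decoding have something to find.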


[merged] ocfs2-avoid-inode-removed-while-nfsd-access-it.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: ocfs2: avoid inode removal while nfsd is accessing it
has been removed from the -mm tree.  Its filename was
     ocfs2-avoid-inode-removed-while-nfsd-access-it.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Junxiao Bi <junxiao.bi(a)oracle.com>
Subject: ocfs2: avoid inode removal while nfsd is accessing it

Patch series "ocfs2: fix nfsd over ocfs2 issues", v2.

This is a series of patches to fix issues on nfsd over ocfs2.  patch 1 is
to avoid inode removed while nfsd access it patch 2 & 3 is to fix a panic
issue.

This patch (of 4):

When nfsd is getting file dentry using handle or parent dentry of some
dentry, one cluster lock is used to avoid inode removed from other node,
but it still could be removed from local node, so use a rw lock to avoid
this.

Link: http://lkml.kernel.org/r/20200616183829.87211-1-junxiao.bi@oracle.com
Link: http://lkml.kernel.org/r/20200616183829.87211-2-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi(a)oracle.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Gang He <ghe(a)suse.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/ocfs2/dlmglue.c |   17 ++++++++++++++++-
 fs/ocfs2/ocfs2.h   |    1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

--- a/fs/ocfs2/dlmglue.c~ocfs2-avoid-inode-removed-while-nfsd-access-it
+++ a/fs/ocfs2/dlmglue.c
@@ -689,6 +689,12 @@ static void ocfs2_nfs_sync_lock_res_init
 				   &ocfs2_nfs_sync_lops, osb);
 }

+static void ocfs2_nfs_sync_lock_init(struct ocfs2_super *osb)
+{
+	ocfs2_nfs_sync_lock_res_init(&osb->osb_nfs_sync_lockres, osb);
+	init_rwsem(&osb->nfs_sync_rwlock);
+}
+
 void ocfs2_trim_fs_lock_res_init(struct ocfs2_super *osb)
 {
 	struct ocfs2_lock_res *lockres = &osb->osb_trim_fs_lockres;
@@ -2855,6 +2861,11 @@ int ocfs2_nfs_sync_lock(struct ocfs2_sup
 	if (ocfs2_is_hard_readonly(osb))
 		return -EROFS;

+	if (ex)
+		down_write(&osb->nfs_sync_rwlock);
+	else
+		down_read(&osb->nfs_sync_rwlock);
+
 	if (ocfs2_mount_local(osb))
 		return 0;

@@ -2873,6 +2884,10 @@ void ocfs2_nfs_sync_unlock(struct ocfs2_
 	if (!ocfs2_mount_local(osb))
 		ocfs2_cluster_unlock(osb, lockres,
 				     ex ? LKM_EXMODE : LKM_PRMODE);
+	if (ex)
+		up_write(&osb->nfs_sync_rwlock);
+	else
+		up_read(&osb->nfs_sync_rwlock);
 }

 int ocfs2_trim_fs_lock(struct ocfs2_super *osb,
@@ -3340,7 +3355,7 @@ int ocfs2_dlm_init(struct ocfs2_super *o
 local:
 	ocfs2_super_lock_res_init(&osb->osb_super_lockres, osb);
 	ocfs2_rename_lock_res_init(&osb->osb_rename_lockres, osb);
-	ocfs2_nfs_sync_lock_res_init(&osb->osb_nfs_sync_lockres, osb);
+	ocfs2_nfs_sync_lock_init(osb);
 	ocfs2_orphan_scan_lock_res_init(&osb->osb_orphan_scan.os_lockres, osb);

 	osb->cconn = conn;
--- a/fs/ocfs2/ocfs2.h~ocfs2-avoid-inode-removed-while-nfsd-access-it
+++ a/fs/ocfs2/ocfs2.h
@@ -395,6 +395,7 @@ struct ocfs2_super
 	struct ocfs2_lock_res osb_super_lockres;
 	struct ocfs2_lock_res osb_rename_lockres;
 	struct ocfs2_lock_res osb_nfs_sync_lockres;
+	struct rw_semaphore nfs_sync_rwlock;
 	struct ocfs2_lock_res osb_trim_fs_lockres;
 	struct mutex obs_trim_fs_mutex;
 	struct ocfs2_dlm_debug *osb_dlm_debug;
_

Patches currently in -mm which might be from junxiao.bi(a)oracle.com are


[merged] mm-compaction-make-capture-control-handling-safe-wrt-interrupts.patch removed from -mm tree

by Andrew Morton

The patch titled
     Subject: mm, compaction: make capture control handling safe wrt interrupts
has been removed from the -mm tree.  Its filename was
     mm-compaction-make-capture-control-handling-safe-wrt-interrupts.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, compaction: make capture control handling safe wrt interrupts

Hugh reports:

: While stressing compaction, one run oopsed on NULL capc->cc in
: __free_one_page()'s task_capc(zone): compact_zone_order() had been
: interrupted, and a page was being freed in the return from interrupt.
:
: Though you would not expect it from the source, both gccs I was using (a
: 4.8.1 and a 7.5.0) had chosen to compile compact_zone_order() with the
: ".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before callq
: compact_zone - long after the "current->capture_control = &capc".  An
: interrupt in between those finds capc->cc NULL (zeroed by an earlier rep
: stos).
:
: This could presumably be fixed by a barrier() before setting
: current->capture_control in compact_zone_order(); but would also need more
: care on return from compact_zone(), in order not to risk leaking a page
: captured by interrupt just before capture_control is reset.
:
: Maybe that is the preferable fix, but I felt safer for task_capc() to
: exclude the rather surprising possibility of capture at interrupt time.

I have checked that gcc10 also behaves the same.

The advantage of fix in compact_zone_order() is that we don't add another
test in the page freeing hot path, and that it might prevent future
problems if we stop exposing pointers to uninitialized structures in
current task.

So this patch implements the suggestion for compact_zone_order() with
barrier() (and WRITE_ONCE() to prevent store tearing) for setting
current->capture_control, and prevents page leaking with
WRITE_ONCE/READ_ONCE in the proper order.

Link: http://lkml.kernel.org/r/20200616082649.27173-1-vbabka@suse.cz
Fixes: 5e1f0f098b46 ("mm, compaction: capture a page under direct compaction")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Hugh Dickins <hughd(a)google.com>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: Alex Shi <alex.shi(a)linux.alibaba.com>
Cc: Li Wang <liwang(a)redhat.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: <stable(a)vger.kernel.org>	[5.1+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/compaction.c |   17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

--- a/mm/compaction.c~mm-compaction-make-capture-control-handling-safe-wrt-interrupts
+++ a/mm/compaction.c
@@ -2316,15 +2316,26 @@ static enum compact_result compact_zone_
 		.page = NULL,
 	};

-	current->capture_control = &capc;
+	/*
+	 * Make sure the structs are really initialized before we expose the
+	 * capture control, in case we are interrupted and the interrupt handler
+	 * frees a page.
+	 */
+	barrier();
+	WRITE_ONCE(current->capture_control, &capc);

 	ret = compact_zone(&cc, &capc);

 	VM_BUG_ON(!list_empty(&cc.freepages));
 	VM_BUG_ON(!list_empty(&cc.migratepages));

-	*capture = capc.page;
-	current->capture_control = NULL;
+	/*
+	 * Make sure we hide capture control first before we read the captured
+	 * page pointer, otherwise an interrupt could free and capture a page
+	 * and we would leak it.
+	 */
+	WRITE_ONCE(current->capture_control, NULL);
+	*capture = READ_ONCE(capc.page);

 	return ret;
 }
_

Patches currently in -mm which might be from vbabka(a)suse.cz are

mm-slub-extend-slub_debug-syntax-for-multiple-blocks.patch
mm-slub-make-some-slub_debug-related-attributes-read-only.patch
mm-slub-remove-runtime-allocation-order-changes.patch
mm-slub-make-remaining-slub_debug-related-attributes-read-only.patch
mm-slub-make-reclaim_account-attribute-read-only.patch
mm-slub-introduce-static-key-for-slub_debug.patch
mm-slub-introduce-kmem_cache_debug_flags.patch
mm-slub-introduce-kmem_cache_debug_flags-fix.patch
mm-slub-extend-checks-guarded-by-slub_debug-static-key.patch
mm-slab-slub-move-and-improve-cache_from_obj.patch
mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj.patch
mm-slab-slub-improve-error-reporting-and-overhead-of-cache_from_obj-fix.patch
mm-page_alloc-use-unlikely-in-task_capc.patch
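The ordering the patch enforces (initialize, then publish; hide, then read back) can be sketched in portable C11. This is an analogy, not the kernel code: a volatile-qualified global stands in for current->capture_control with WRITE_ONCE(), `atomic_signal_fence()` stands in for barrier() as a compiler-only fence, and a direct call to `toy_try_capture()` stands in for an interrupt freeing a page. All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_cc {
	int order;
};

struct toy_capture {
	struct toy_cc *cc;
	int *page;
};

/* Pointer that an interrupt-like path inspects; volatile forces real accesses. */
static struct toy_capture *volatile toy_capture_control;

/* Interrupt side: capture a freed page if a control block is visible. */
static int toy_try_capture(int *page)
{
	struct toy_capture *capc = toy_capture_control;

	if (!capc || capc->page)
		return 0;
	assert(capc->cc != NULL);	/* must never observe a half-built capc */
	capc->page = page;
	return 1;
}

static int *toy_compact(int *freed_page)
{
	struct toy_cc cc = { .order = 0 };
	struct toy_capture capc = { .cc = &cc, .page = NULL };
	int *captured;

	/* Finish initializing capc before it becomes visible. */
	atomic_signal_fence(memory_order_seq_cst);
	toy_capture_control = &capc;

	toy_try_capture(freed_page);	/* stands in for an interrupt */

	/* Hide the control block first, then read what was captured. */
	toy_capture_control = NULL;
	atomic_signal_fence(memory_order_seq_cst);
	captured = capc.page;

	return captured;
}
```

Reading `capc.page` before clearing the pointer would leave a window in which a capture lands after the read and is never returned, which is the leak described in the second comment of the hunk.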

