Linux-stable-mirror

linux-stable-mirror@lists.linaro.org

326 participants
124230 discussions

by Jack Wang

Hi Greg, We hit a panic in 4.14.43 with md raid1. It's easy to reproduce with a test case, running Fio, and doing mdadm --grow bitmap=none /dev/mdx && mdadm --grow bitmap=internal /dev/mdx in a loop. With following patches applied, we can no longer reproduce the problem. Please consider to apply the patches to 4.14, they can be applied cleanly on 4.14.52. Cheers, Jack NeilBrown (6): md: always hold reconfig_mutex when calling mddev_suspend() md: don't call bitmap_create() while array is quiesced. md: move suspend_hi/lo handling into core md code md: use mddev_suspend/resume instead of ->quiesce() md: allow metadata update while suspending. md: remove special meaning of ->quiesce(.., 2) drivers/md/dm-raid.c | 10 ++++-- drivers/md/md-cluster.c | 6 ++-- drivers/md/md.c | 90 ++++++++++++++++++++++++++++++------------------ drivers/md/md.h | 15 +++++--- drivers/md/raid0.c | 2 +- drivers/md/raid1.c | 27 +++++---------- drivers/md/raid10.c | 10 ++---- drivers/md/raid5-cache.c | 30 ++++++++++------ drivers/md/raid5-log.h | 2 +- drivers/md/raid5.c | 40 ++++----------------- 10 files changed, 116 insertions(+), 116 deletions(-) -- 2.7.4

7 years, 6 months

[PATCH 6/6] tracing: Fix missing return symbol in function_graph output

by Steven Rostedt

From: Changbin Du <changbin.du(a)intel.com> The function_graph tracer does not show the interrupt return marker for the leaf entry. On leaf entries, we see an unbalanced interrupt marker (the interrupt was entered, but nevern left). Before: 1) | SyS_write() { 1) | __fdget_pos() { 1) 0.061 us | __fget_light(); 1) 0.289 us | } 1) | vfs_write() { 1) 0.049 us | rw_verify_area(); 1) + 15.424 us | __vfs_write(); 1) ==========> | 1) 6.003 us | smp_apic_timer_interrupt(); 1) 0.055 us | __fsnotify_parent(); 1) 0.073 us | fsnotify(); 1) + 23.665 us | } 1) + 24.501 us | } After: 0) | SyS_write() { 0) | __fdget_pos() { 0) 0.052 us | __fget_light(); 0) 0.328 us | } 0) | vfs_write() { 0) 0.057 us | rw_verify_area(); 0) | __vfs_write() { 0) ==========> | 0) 8.548 us | smp_apic_timer_interrupt(); 0) <========== | 0) + 36.507 us | } /* __vfs_write */ 0) 0.049 us | __fsnotify_parent(); 0) 0.066 us | fsnotify(); 0) + 50.064 us | } 0) + 50.952 us | } Link: http://lkml.kernel.org/r/1517413729-20411-1-git-send-email-changbin.du@inte… Cc: stable(a)vger.kernel.org Fixes: f8b755ac8e0cc ("tracing/function-graph-tracer: Output arrows signal on hardirq call/return") Signed-off-by: Changbin Du <changbin.du(a)intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> --- kernel/trace/trace_functions_graph.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c index 23c0b0cb5fb9..169b3c44ee97 100644 --- a/kernel/trace/trace_functions_graph.c +++ b/kernel/trace/trace_functions_graph.c @@ -831,6 +831,7 @@ print_graph_entry_leaf(struct trace_iterator *iter, struct ftrace_graph_ret *graph_ret; struct ftrace_graph_ent *call; unsigned long long duration; + int cpu = iter->cpu; int i; graph_ret = &ret_entry->ret; @@ -839,7 +840,6 @@ print_graph_entry_leaf(struct trace_iterator *iter, if (data) { struct fgraph_cpu_data *cpu_data; - int cpu = iter->cpu; cpu_data = per_cpu_ptr(data->cpu_data, cpu); @@ -869,6 +869,9 @@ print_graph_entry_leaf(struct trace_iterator *iter, trace_seq_printf(s, "%ps();\n", (void *)call->func); + print_graph_irq(iter, graph_ret->func, TRACE_GRAPH_RET, + cpu, iter->ent->pid, flags); + return trace_handle_return(s); } -- 2.17.1

7 years, 6 months

[PATCH 1/6] tracing: Avoid string overflow

by Steven Rostedt

From: Arnd Bergmann <arnd(a)arndb.de> 'err' is used as a NUL-terminated string, but using strncpy() with the length equal to the buffer size may result in lack of the termination: kernel/trace/trace_events_hist.c: In function 'hist_err_event': kernel/trace/trace_events_hist.c:396:3: error: 'strncpy' specified bound 256 equals destination size [-Werror=stringop-truncation] strncpy(err, var, MAX_FILTER_STR_VAL); This changes it to use the safer strscpy() instead. Link: http://lkml.kernel.org/r/20180328140920.2842153-1-arnd@arndb.de Cc: stable(a)vger.kernel.org Fixes: f404da6e1d46 ("tracing: Add 'last error' error facility for hist triggers") Acked-by: Tom Zanussi <tom.zanussi(a)linux.intel.com> Signed-off-by: Arnd Bergmann <arnd(a)arndb.de> Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org> --- kernel/trace/trace_events_hist.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 046c716a6536..aae18af94c94 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -393,7 +393,7 @@ static void hist_err_event(char *str, char *system, char *event, char *var) else if (system) snprintf(err, MAX_FILTER_STR_VAL, "%s.%s", system, event); else - strncpy(err, var, MAX_FILTER_STR_VAL); + strscpy(err, var, MAX_FILTER_STR_VAL); hist_err(str, err); } -- 2.17.1

7 years, 6 months

[PATCH] arm64: dts: stratix10: Fix SPI nodes for Stratix10

by thor.thayer＠linux.intel.com

From: Thor Thayer <thor.thayer(a)linux.intel.com> Patch did not apply cleanly to 4.9 and 4.14 stable branches because of additional node properties added after 4.14. So backport patch and cleanup commit 4595299c5eae ("arm64: dts: stratix10: Fix SPI nodes for Stratix10") Remove the unused bus-num node and change num-chipselect to num-cs to match SPI bindings. Cc: stable(a)vger.kernel.org # 4.14.x Cc: stable(a)vger.kernel.org # 4.9.x Fixes: 78cd6a9d8e154 ("arm64: dts: Add base stratix 10 dtsi") Signed-off-by: Thor Thayer <thor.thayer(a)linux.intel.com> Signed-off-by: Dinh Nguyen <dinguyen(a)kernel.org> Signed-off-by: Olof Johansson <olof(a)lixom.net> --- arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi index c2b9bcb0ef61..57f75453ee93 100644 --- a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi +++ b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi @@ -231,8 +231,7 @@ #size-cells = <0>; reg = <0xffda4000 0x1000>; interrupts = <0 101 4>; - num-chipselect = <4>; - bus-num = <0>; + num-cs = <4>; status = "disabled"; }; @@ -242,8 +241,7 @@ #size-cells = <0>; reg = <0xffda5000 0x1000>; interrupts = <0 102 4>; - num-chipselect = <4>; - bus-num = <0>; + num-cs = <4>; status = "disabled"; }; -- 2.7.4

7 years, 6 months

FAILED: patch "[PATCH] dm: prevent DAX mounts if not supported" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From dbc626597c39b24cefce09fbd8e9dea85869a801 Mon Sep 17 00:00:00 2001 From: Ross Zwisler <ross.zwisler(a)linux.intel.com> Date: Tue, 26 Jun 2018 16:30:41 -0600 Subject: [PATCH] dm: prevent DAX mounts if not supported Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX flag is set on the device's request queue to decide whether or not the device supports filesystem DAX. Really we should be using bdev_dax_supported() like filesystems do at mount time. This performs other tests like checking to make sure the dax_direct_access() path works. We also explicitly clear QUEUE_FLAG_DAX on the DM device's request queue if any of the underlying devices do not support DAX. This makes the handling of QUEUE_FLAG_DAX consistent with the setting/clearing of most other flags in dm_table_set_restrictions(). Now that bdev_dax_supported() explicitly checks for QUEUE_FLAG_DAX, this will ensure that filesystems built upon DM devices will only be able to mount with DAX if all underlying devices also support DAX. Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com> Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support") Cc: stable(a)vger.kernel.org Acked-by: Dan Williams <dan.j.williams(a)intel.com> Reviewed-by: Toshi Kani <toshi.kani(a)hpe.com> Signed-off-by: Mike Snitzer <snitzer(a)redhat.com> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 938766794c2e..3d0e2c198f06 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -885,9 +885,7 @@ EXPORT_SYMBOL_GPL(dm_table_set_type); static int device_supports_dax(struct dm_target *ti, struct dm_dev *dev, sector_t start, sector_t len, void *data) { - struct request_queue *q = bdev_get_queue(dev->bdev); - - return q && blk_queue_dax(q); + return bdev_dax_supported(dev->bdev, PAGE_SIZE); } static bool dm_table_supports_dax(struct dm_table *t) @@ -1907,6 +1905,9 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, if (dm_table_supports_dax(t)) blk_queue_flag_set(QUEUE_FLAG_DAX, q); + else + blk_queue_flag_clear(QUEUE_FLAG_DAX, q); + if (dm_table_supports_dax_write_cache(t)) dax_write_cache(t->md->dax_dev, true); diff --git a/drivers/md/dm.c b/drivers/md/dm.c index a3b103e8e3ce..b0dd7027848b 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1056,8 +1056,7 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, if (len < 1) goto out; nr_pages = min(len, nr_pages); - if (ti->type->direct_access) - ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn); + ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn); out: dm_put_live_table(md, srcu_idx);

7 years, 6 months

[PATCH] MIPS: Fix ioremap() RAM check

by Paul Burton

We currently attempt to check whether a physical address range provided to __ioremap() may be in use by the page allocator by examining the value of PageReserved for each page in the region - lowmem pages not marked reserved are presumed to be in use by the page allocator, and requests to ioremap them fail. The way we check this has been broken since commit 92923ca3aace ("mm: meminit: only set page reserved in the memblock region"), because memblock will typically not have any knowledge of non-RAM pages and therefore those pages will not have the PageReserved flag set. Thus when we attempt to ioremap a region outside of RAM we incorrectly fail believing that the region is RAM that may be in use. In most cases ioremap() on MIPS will take a fast-path to use the unmapped kseg1 or xkphys virtual address spaces and never hit this path, so the only way to hit it is for a MIPS32 system to attempt to ioremap() an address range in lowmem with flags other than _CACHE_UNCACHED. Perhaps the most straightforward way to do this is using ioremap_uncached_accelerated(), which is how the problem was discovered. Fix this by making use of walk_system_ram_range() to test the address range provided to __ioremap() against only RAM pages, rather than all lowmem pages. This means that if we have a lowmem I/O region, which is very common for MIPS systems, we're free to ioremap() address ranges within it. A nice bonus is that the test is no longer limited to lowmem. The approach here matches the way x86 performed the same test after commit c81c8a1eeede ("x86, ioremap: Speed up check for RAM pages") until x86 moved towards a slightly more complicated check using walk_mem_res() for unrelated reasons with commit 0e4c12b45aa8 ("x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pages"). Signed-off-by: Paul Burton <paul.burton(a)mips.com> Reported-by: Serge Semin <fancer.lancer(a)gmail.com> Tested-by: Serge Semin <fancer.lancer(a)gmail.com> Fixes: 92923ca3aace ("mm: meminit: only set page reserved in the memblock region") Cc: James Hogan <jhogan(a)kernel.org> Cc: Ralf Baechle <ralf(a)linux-mips.org> Cc: linux-mips(a)linux-mips.org Cc: stable(a)vger.kernel.org # v4.2+ --- arch/mips/mm/ioremap.c | 37 +++++++++++++++++++++++++------------ 1 file changed, 25 insertions(+), 12 deletions(-) diff --git a/arch/mips/mm/ioremap.c b/arch/mips/mm/ioremap.c index 1986e09fb457..1601d90b087b 100644 --- a/arch/mips/mm/ioremap.c +++ b/arch/mips/mm/ioremap.c @@ -9,6 +9,7 @@ #include <linux/export.h> #include <asm/addrspace.h> #include <asm/byteorder.h> +#include <linux/ioport.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/vmalloc.h> @@ -98,6 +99,20 @@ static int remap_area_pages(unsigned long address, phys_addr_t phys_addr, return error; } +static int __ioremap_check_ram(unsigned long start_pfn, unsigned long nr_pages, + void *arg) +{ + unsigned long i; + + for (i = 0; i < nr_pages; i++) { + if (pfn_valid(start_pfn + i) && + !PageReserved(pfn_to_page(start_pfn + i))) + return 1; + } + + return 0; +} + /* * Generic mapping function (not visible outside): */ @@ -116,8 +131,8 @@ static int remap_area_pages(unsigned long address, phys_addr_t phys_addr, void __iomem * __ioremap(phys_addr_t phys_addr, phys_addr_t size, unsigned long flags) { + unsigned long offset, pfn, last_pfn; struct vm_struct * area; - unsigned long offset; phys_addr_t last_addr; void * addr; @@ -137,18 +152,16 @@ void __iomem * __ioremap(phys_addr_t phys_addr, phys_addr_t size, unsigned long return (void __iomem *) CKSEG1ADDR(phys_addr); /* - * Don't allow anybody to remap normal RAM that we're using.. + * Don't allow anybody to remap RAM that may be allocated by the page + * allocator, since that could lead to races & data clobbering. */ - if (phys_addr < virt_to_phys(high_memory)) { - char *t_addr, *t_end; - struct page *page; - - t_addr = __va(phys_addr); - t_end = t_addr + (size - 1); - - for(page = virt_to_page(t_addr); page <= virt_to_page(t_end); page++) - if(!PageReserved(page)) - return NULL; + pfn = PFN_DOWN(phys_addr); + last_pfn = PFN_DOWN(last_addr); + if (walk_system_ram_range(pfn, last_pfn - pfn + 1, NULL, + __ioremap_check_ram) == 1) { + WARN_ONCE(1, "ioremap on RAM at %pa - %pa\n", + &phys_addr, &last_addr); + return NULL; } /* -- 2.18.0

7 years, 6 months

[PATCH v2] dm-zoned: Avoid triggering reclaim from inside dmz_map()

by Bart Van Assche

This patch avoids that lockdep reports the following: ====================================================== WARNING: possible circular locking dependency detected 4.18.0-rc1 #62 Not tainted ------------------------------------------------------ kswapd0/84 is trying to acquire lock: 00000000c313516d (&xfs_nondir_ilock_class){++++}, at: xfs_free_eofblocks+0xa2/0x1e0 but task is already holding lock: 00000000591c83ae (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (fs_reclaim){+.+.}: kmem_cache_alloc+0x2c/0x2b0 radix_tree_node_alloc.constprop.19+0x3d/0xc0 __radix_tree_create+0x161/0x1c0 __radix_tree_insert+0x45/0x210 dmz_map+0x245/0x2d0 [dm_zoned] __map_bio+0x40/0x260 __split_and_process_non_flush+0x116/0x220 __split_and_process_bio+0x81/0x180 __dm_make_request.isra.32+0x5a/0x100 generic_make_request+0x36e/0x690 submit_bio+0x6c/0x140 mpage_readpages+0x19e/0x1f0 read_pages+0x6d/0x1b0 __do_page_cache_readahead+0x21b/0x2d0 force_page_cache_readahead+0xc4/0x100 generic_file_read_iter+0x7c6/0xd20 __vfs_read+0x102/0x180 vfs_read+0x9b/0x140 ksys_read+0x55/0xc0 do_syscall_64+0x5a/0x1f0 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #1 (&dmz->chunk_lock){+.+.}: dmz_map+0x133/0x2d0 [dm_zoned] __map_bio+0x40/0x260 __split_and_process_non_flush+0x116/0x220 __split_and_process_bio+0x81/0x180 __dm_make_request.isra.32+0x5a/0x100 generic_make_request+0x36e/0x690 submit_bio+0x6c/0x140 _xfs_buf_ioapply+0x31c/0x590 xfs_buf_submit_wait+0x73/0x520 xfs_buf_read_map+0x134/0x2f0 xfs_trans_read_buf_map+0xc3/0x580 xfs_read_agf+0xa5/0x1e0 xfs_alloc_read_agf+0x59/0x2b0 xfs_alloc_pagf_init+0x27/0x60 xfs_bmap_longest_free_extent+0x43/0xb0 xfs_bmap_btalloc_nullfb+0x7f/0xf0 xfs_bmap_btalloc+0x428/0x7c0 xfs_bmapi_write+0x598/0xcc0 xfs_iomap_write_allocate+0x15a/0x330 xfs_map_blocks+0x1cf/0x3f0 xfs_do_writepage+0x15f/0x7b0 write_cache_pages+0x1ca/0x540 xfs_vm_writepages+0x65/0xa0 do_writepages+0x48/0xf0 __writeback_single_inode+0x58/0x730 writeback_sb_inodes+0x249/0x5c0 wb_writeback+0x11e/0x550 wb_workfn+0xa3/0x670 process_one_work+0x228/0x670 worker_thread+0x3c/0x390 kthread+0x11c/0x140 ret_from_fork+0x3a/0x50 -> #0 (&xfs_nondir_ilock_class){++++}: down_read_nested+0x43/0x70 xfs_free_eofblocks+0xa2/0x1e0 xfs_fs_destroy_inode+0xac/0x270 dispose_list+0x51/0x80 prune_icache_sb+0x52/0x70 super_cache_scan+0x127/0x1a0 shrink_slab.part.47+0x1bd/0x590 shrink_node+0x3b5/0x470 balance_pgdat+0x158/0x3b0 kswapd+0x1ba/0x600 kthread+0x11c/0x140 ret_from_fork+0x3a/0x50 other info that might help us debug this: Chain exists of: &xfs_nondir_ilock_class --> &dmz->chunk_lock --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(&dmz->chunk_lock); lock(fs_reclaim); lock(&xfs_nondir_ilock_class); *** DEADLOCK *** 3 locks held by kswapd0/84: #0: 00000000591c83ae (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 #1: 000000000f8208f5 (shrinker_rwsem){++++}, at: shrink_slab.part.47+0x3f/0x590 #2: 00000000cacefa54 (&type->s_umount_key#43){.+.+}, at: trylock_super+0x16/0x50 stack backtrace: CPU: 7 PID: 84 Comm: kswapd0 Not tainted 4.18.0-rc1 #62 Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0 12/17/2015 Call Trace: dump_stack+0x85/0xcb print_circular_bug.isra.36+0x1ce/0x1db __lock_acquire+0x124e/0x1310 lock_acquire+0x9f/0x1f0 down_read_nested+0x43/0x70 xfs_free_eofblocks+0xa2/0x1e0 xfs_fs_destroy_inode+0xac/0x270 dispose_list+0x51/0x80 prune_icache_sb+0x52/0x70 super_cache_scan+0x127/0x1a0 shrink_slab.part.47+0x1bd/0x590 shrink_node+0x3b5/0x470 balance_pgdat+0x158/0x3b0 kswapd+0x1ba/0x600 kthread+0x11c/0x140 ret_from_fork+0x3a/0x50 Reported-by: Masato Suzuki <masato.suzuki(a)wdc.com> Fixes: 4218a9554653 ("dm zoned: use GFP_NOIO in I/O path") Signed-off-by: Bart Van Assche <bart.vanassche(a)wdc.com> Cc: Damien Le Moal <Damien.LeMoal(a)wdc.com> Cc: Mikulas Patocka <mpatocka(a)redhat.com> Cc: <stable(a)vger.kernel.org> --- Changes compared to v1: added "Cc: stable" drivers/md/dm-zoned-target.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c index 3c0e45f4dcf5..a44183ff4be0 100644 --- a/drivers/md/dm-zoned-target.c +++ b/drivers/md/dm-zoned-target.c @@ -787,7 +787,7 @@ static int dmz_ctr(struct dm_target *ti, unsigned int argc, char **argv) /* Chunk BIO work */ mutex_init(&dmz->chunk_lock); - INIT_RADIX_TREE(&dmz->chunk_rxtree, GFP_KERNEL); + INIT_RADIX_TREE(&dmz->chunk_rxtree, GFP_NOIO); dmz->chunk_wq = alloc_workqueue("dmz_cwq_%s", WQ_MEM_RECLAIM | WQ_UNBOUND, 0, dev->name); if (!dmz->chunk_wq) { -- 2.17.1

7 years, 6 months

[merged] mm-teach-dump_page-to-correctly-output-poisoned-struct-pages.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: mm: teach dump_page() to correctly output poisoned struct pages has been removed from the -mm tree. Its filename was mm-teach-dump_page-to-correctly-output-poisoned-struct-pages.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Pavel Tatashin <pasha.tatashin(a)oracle.com> Subject: mm: teach dump_page() to correctly output poisoned struct pages If struct page is poisoned, and uninitialized access is detected via PF_POISONED_CHECK(page) dump_page() is called to output the page. But, the dump_page() itself accesses struct page to determine how to print it, and therefore gets into a recursive loop. For example: dump_page() __dump_page() PageSlab(page) PF_POISONED_CHECK(page) VM_BUG_ON_PGFLAGS(PagePoisoned(page), page) dump_page() recursion loop. Link: http://lkml.kernel.org/r/20180702180536.2552-1-pasha.tatashin@oracle.com Fixes: f165b378bbdf ("mm: uninitialized struct page poisoning sanity checking") Signed-off-by: Pavel Tatashin <pasha.tatashin(a)oracle.com> Acked-by: Michal Hocko <mhocko(a)suse.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/debug.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff -puN mm/debug.c~mm-teach-dump_page-to-correctly-output-poisoned-struct-pages mm/debug.c --- a/mm/debug.c~mm-teach-dump_page-to-correctly-output-poisoned-struct-pages +++ a/mm/debug.c @@ -43,12 +43,25 @@ const struct trace_print_flags vmaflag_n void __dump_page(struct page *page, const char *reason) { + bool page_poisoned = PagePoisoned(page); + int mapcount; + + /* + * If struct page is poisoned don't access Page*() functions as that + * leads to recursive loop. Page*() check for poisoned pages, and calls + * dump_page() when detected. + */ + if (page_poisoned) { + pr_emerg("page:%px is uninitialized and poisoned", page); + goto hex_only; + } + /* * Avoid VM_BUG_ON() in page_mapcount(). * page->_mapcount space in struct page is used by sl[aou]b pages to * encode own info. */ - int mapcount = PageSlab(page) ? 0 : page_mapcount(page); + mapcount = PageSlab(page) ? 0 : page_mapcount(page); pr_emerg("page:%px count:%d mapcount:%d mapping:%px index:%#lx", page, page_ref_count(page), mapcount, @@ -60,6 +73,7 @@ void __dump_page(struct page *page, cons pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags); +hex_only: print_hex_dump(KERN_ALERT, "raw: ", DUMP_PREFIX_NONE, 32, sizeof(unsigned long), page, sizeof(struct page), false); @@ -68,7 +82,7 @@ void __dump_page(struct page *page, cons pr_alert("page dumped because: %s\n", reason); #ifdef CONFIG_MEMCG - if (page->mem_cgroup) + if (!page_poisoned && page->mem_cgroup) pr_alert("page->mem_cgroup:%px\n", page->mem_cgroup); #endif } _ Patches currently in -mm which might be from pasha.tatashin(a)oracle.com are mm-skip-invalid-pages-block-at-a-time-in-zero_resv_unresv.patch sparc64-ng4-memset-32-bits-overflow.patch

7 years, 6 months

[merged] mm-hugetlb-yield-when-prepping-struct-pages.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: mm: hugetlb: yield when prepping struct pages has been removed from the -mm tree. Its filename was mm-hugetlb-yield-when-prepping-struct-pages.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Cannon Matthews <cannonmatthews(a)google.com> Subject: mm: hugetlb: yield when prepping struct pages When booting with very large numbers of gigantic (i.e. 1G) pages, the operations in the loop of gather_bootmem_prealloc, and specifically prep_compound_gigantic_page, takes a very long time, and can cause a softlockup if enough pages are requested at boot. For example booting with 3844 1G pages requires prepping (set_compound_head, init the count) over 1 billion 4K tail pages, which takes considerable time. Add a cond_resched() to the outer loop in gather_bootmem_prealloc() to prevent this lockup. Tested: Booted with softlockup_panic=1 hugepagesz=1G hugepages=3844 and no softlockup is reported, and the hugepages are reported as successfully setup. Link: http://lkml.kernel.org/r/20180627214447.260804-1-cannonmatthews@google.com Signed-off-by: Cannon Matthews <cannonmatthews(a)google.com> Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org> Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com> Acked-by: Michal Hocko <mhocko(a)suse.com> Cc: Andres Lagar-Cavilla <andreslc(a)google.com> Cc: Peter Feiner <pfeiner(a)google.com> Cc: Greg Thelen <gthelen(a)google.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/hugetlb.c | 1 + 1 file changed, 1 insertion(+) diff -puN mm/hugetlb.c~mm-hugetlb-yield-when-prepping-struct-pages mm/hugetlb.c --- a/mm/hugetlb.c~mm-hugetlb-yield-when-prepping-struct-pages +++ a/mm/hugetlb.c @@ -2163,6 +2163,7 @@ static void __init gather_bootmem_preall */ if (hstate_is_gigantic(h)) adjust_managed_page_count(page, 1 << h->order); + cond_resched(); } } _ Patches currently in -mm which might be from cannonmatthews(a)google.com are

7 years, 6 months

[merged] userfaultfd-hugetlbfs-fix-userfaultfd_huge_must_wait-pte-access.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: userfaultfd: hugetlbfs: fix userfaultfd_huge_must_wait() pte access has been removed from the -mm tree. Its filename was userfaultfd-hugetlbfs-fix-userfaultfd_huge_must_wait-pte-access.patch This patch was dropped because it was merged into mainline or a subsystem tree ------------------------------------------------------ From: Janosch Frank <frankja(a)linux.ibm.com> Subject: userfaultfd: hugetlbfs: fix userfaultfd_huge_must_wait() pte access Use huge_ptep_get() to translate huge ptes to normal ptes so we can check them with the huge_pte_* functions. Otherwise some architectures will check the wrong values and will not wait for userspace to bring in the memory. Link: http://lkml.kernel.org/r/20180626132421.78084-1-frankja@linux.ibm.com Fixes: 369cd2121be4 ("userfaultfd: hugetlbfs: userfaultfd_huge_must_wait for hugepmd ranges") Signed-off-by: Janosch Frank <frankja(a)linux.ibm.com> Reviewed-by: David Hildenbrand <david(a)redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com> Cc: Andrea Arcangeli <aarcange(a)redhat.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/userfaultfd.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff -puN fs/userfaultfd.c~userfaultfd-hugetlbfs-fix-userfaultfd_huge_must_wait-pte-access fs/userfaultfd.c --- a/fs/userfaultfd.c~userfaultfd-hugetlbfs-fix-userfaultfd_huge_must_wait-pte-access +++ a/fs/userfaultfd.c @@ -222,24 +222,26 @@ static inline bool userfaultfd_huge_must unsigned long reason) { struct mm_struct *mm = ctx->mm; - pte_t *pte; + pte_t *ptep, pte; bool ret = true; VM_BUG_ON(!rwsem_is_locked(&mm->mmap_sem)); - pte = huge_pte_offset(mm, address, vma_mmu_pagesize(vma)); - if (!pte) + ptep = huge_pte_offset(mm, address, vma_mmu_pagesize(vma)); + + if (!ptep) goto out; ret = false; + pte = huge_ptep_get(ptep); /* * Lockless access: we're in a wait_event so it's ok if it * changes under us. */ - if (huge_pte_none(*pte)) + if (huge_pte_none(pte)) ret = true; - if (!huge_pte_write(*pte) && (reason & VM_UFFD_WP)) + if (!huge_pte_write(pte) && (reason & VM_UFFD_WP)) ret = true; out: return ret; _ Patches currently in -mm which might be from frankja(a)linux.ibm.com are

7 years, 6 months

Jump to page:

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror