October 2024 - Linux-stable-mirror

[PATCH can] can: mcp251xfd: mcp251xfd_ring_alloc(): fix coalescing configuration when switching CAN modes

by Marc Kleine-Budde

Since commit 50ea5449c563 ("can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode"), the current ring and coalescing configuration is passed to can_ram_get_layout(). That fixed the issue when switching between CAN-CC and CAN-FD mode with configured ring (rx, tx) and/or coalescing parameters (rx-frames-irq, tx-frames-irq).

However 50ea5449c563 ("can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode"), introduced a regression when switching CAN modes with disabled coalescing configuration: Even if the previous CAN mode has no coalescing configured, the new mode is configured with active coalescing. This leads to delayed receiving of CAN-FD frames.

This comes from the fact, that ethtool uses usecs = 0 and max_frames = 1 to disable coalescing, however the driver uses internally priv->{rx,tx}_obj_num_coalesce_irq = 0 to indicate disabled coalescing.

Fix the regression by assigning struct ethtool_coalesce ec->{rx,tx}_max_coalesced_frames_irq = 1 if coalescing is disabled in the driver as can_ram_get_layout() expects this.

Reported-by: https://github.com/vdh-robothania
Closes: https://github.com/raspberrypi/linux/issues/6407
Fixes: 50ea5449c563 ("can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode")
Cc: stable(a)vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
 drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c
index e684991fa3917d4f6b6ebda8329f72971237574e..7209a831f0f2089e409c6be635f0e5dc7b2271da 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c
@@ -2,7 +2,7 @@
 //
 // mcp251xfd - Microchip MCP251xFD Family CAN controller driver
 //
-// Copyright (c) 2019, 2020, 2021 Pengutronix,
+// Copyright (c) 2019, 2020, 2021, 2024 Pengutronix,
 //               Marc Kleine-Budde <kernel(a)pengutronix.de>
 //
 // Based on:
@@ -483,9 +483,11 @@ int mcp251xfd_ring_alloc(struct mcp251xfd_priv *priv)
 	};
 	const struct ethtool_coalesce ec = {
 		.rx_coalesce_usecs_irq = priv->rx_coalesce_usecs_irq,
-		.rx_max_coalesced_frames_irq = priv->rx_obj_num_coalesce_irq,
+		.rx_max_coalesced_frames_irq = priv->rx_obj_num_coalesce_irq == 0 ?
+			1 : priv->rx_obj_num_coalesce_irq,
 		.tx_coalesce_usecs_irq = priv->tx_coalesce_usecs_irq,
-		.tx_max_coalesced_frames_irq = priv->tx_obj_num_coalesce_irq,
+		.tx_max_coalesced_frames_irq = priv->tx_obj_num_coalesce_irq == 0 ?
+			1 : priv->tx_obj_num_coalesce_irq,
 	};
 	struct can_ram_layout layout;

---
base-commit: 9efc44fb2dba6138b0575826319200049078679a
change-id: 20241010-mcp251xfd-fix-coalesing-f373066dd42e

Best regards,
-- 
Marc Kleine-Budde <mkl(a)pengutronix.de>
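
For readers skimming the archive, the convention mismatch described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: the helper name coalesce_frames_for_ethtool() is hypothetical and does not exist in the driver, which open-codes the same expression in the struct ethtool_coalesce initializer.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the mcp251xfd driver): translate the
 * driver-internal object count into the value ethtool expects.
 *
 *   driver convention:  0 means coalescing is disabled
 *   ethtool convention: max_frames = 1 (with usecs = 0) means disabled
 */
unsigned int coalesce_frames_for_ethtool(unsigned int obj_num_coalesce_irq)
{
	return obj_num_coalesce_irq == 0 ? 1 : obj_num_coalesce_irq;
}
```

A non-zero count passes through unchanged, so an explicitly configured coalescing setup is not affected by the translation.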


[PATCHSET v5.1 3/9] xfs: metadata inode directory trees

by Darrick J. Wong

Hi all,

This series delivers a new feature -- metadata inode directories. This is a separate directory tree (rooted in the superblock) that contains only inodes that contain filesystem metadata. Different metadata objects can be looked up with regular paths.

Start by creating xfs_imeta{dir,file}* functions to mediate access to the metadata directory tree. By the end of this mega series, all existing metadata inodes (rt+quota) will use this directory tree instead of the superblock.

Next, define the metadir on-disk format, which consists of marking inodes with a new iflag that says they're metadata. This prevents bulkstat and friends from ever getting their hands on fs metadata files.

If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below.

This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=me…

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h…
---
Commits in this patchset:
 * xfs: constify the xfs_sb predicates
 * xfs: constify the xfs_inode predicates
 * xfs: rename metadata inode predicates
 * xfs: standardize EXPERIMENTAL warning generation
 * xfs: define the on-disk format for the metadir feature
 * xfs: iget for metadata inodes
 * xfs: load metadata directory root at mount time
 * xfs: enforce metadata inode flag
 * xfs: read and write metadata inode directory tree
 * xfs: disable the agi rotor for metadata inodes
 * xfs: hide metadata inodes from everyone because they are special
 * xfs: advertise metadata directory feature
 * xfs: allow bulkstat to return metadata directories
 * xfs: don't count metadata directory files to quota
 * xfs: mark quota inodes as metadata files
 * xfs: adjust xfs_bmap_add_attrfork for metadir
 * xfs: record health problems with the metadata directory
 * xfs: refactor directory tree root predicates
 * xfs: do not count metadata directory files when doing online quotacheck
 * xfs: don't fail repairs on metadata files with no attr fork
 * xfs: metadata files can have xattrs if metadir is enabled
 * xfs: adjust parent pointer scrubber for sb-rooted metadata files
 * xfs: fix di_metatype field of inodes that won't load
 * xfs: scrub metadata directories
 * xfs: check the metadata directory inumber in superblocks
 * xfs: move repair temporary files to the metadata directory tree
 * xfs: check metadata directory file path connectivity
 * xfs: confirm dotdot target before replacing it during a repair
 * xfs: repair metadata directory file path connectivity
---
 fs/xfs/Makefile                 |   5
 fs/xfs/libxfs/xfs_attr.c        |   5
 fs/xfs/libxfs/xfs_bmap.c        |   5
 fs/xfs/libxfs/xfs_format.h      | 121 +++++++--
 fs/xfs/libxfs/xfs_fs.h          |  25 ++
 fs/xfs/libxfs/xfs_health.h      |   6
 fs/xfs/libxfs/xfs_ialloc.c      |  58 +++-
 fs/xfs/libxfs/xfs_inode_buf.c   |  90 ++++++-
 fs/xfs/libxfs/xfs_inode_buf.h   |   3
 fs/xfs/libxfs/xfs_inode_util.c  |   2
 fs/xfs/libxfs/xfs_log_format.h  |   2
 fs/xfs/libxfs/xfs_metadir.c     | 481 ++++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_metadir.h     |  47 ++++
 fs/xfs/libxfs/xfs_metafile.c    |  52 ++++
 fs/xfs/libxfs/xfs_metafile.h    |  31 ++
 fs/xfs/libxfs/xfs_ondisk.h      |   2
 fs/xfs/libxfs/xfs_sb.c          |  12 +
 fs/xfs/libxfs/xfs_types.c       |   4
 fs/xfs/libxfs/xfs_types.h       |   2
 fs/xfs/scrub/agheader.c         |   5
 fs/xfs/scrub/common.c           |  65 ++++-
 fs/xfs/scrub/common.h           |   5
 fs/xfs/scrub/dir.c              |  10 +
 fs/xfs/scrub/dir_repair.c       |  20 +
 fs/xfs/scrub/dirtree.c          |  32 ++
 fs/xfs/scrub/dirtree.h          |  12 -
 fs/xfs/scrub/findparent.c       |  28 ++
 fs/xfs/scrub/health.c           |   1
 fs/xfs/scrub/inode.c            |  35 ++-
 fs/xfs/scrub/inode_repair.c     |  34 ++-
 fs/xfs/scrub/metapath.c         | 521 +++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/nlinks.c           |   4
 fs/xfs/scrub/nlinks_repair.c    |   4
 fs/xfs/scrub/orphanage.c        |   4
 fs/xfs/scrub/parent.c           |  39 ++-
 fs/xfs/scrub/parent_repair.c    |  37 ++-
 fs/xfs/scrub/quotacheck.c       |   7 -
 fs/xfs/scrub/refcount_repair.c  |   2
 fs/xfs/scrub/repair.c           |  22 +-
 fs/xfs/scrub/repair.h           |   3
 fs/xfs/scrub/scrub.c            |  12 +
 fs/xfs/scrub/scrub.h            |   2
 fs/xfs/scrub/stats.c            |   1
 fs/xfs/scrub/tempfile.c         | 105 ++++++++
 fs/xfs/scrub/tempfile.h         |   3
 fs/xfs/scrub/trace.c            |   1
 fs/xfs/scrub/trace.h            |  42 +++
 fs/xfs/xfs_dquot.c              |   1
 fs/xfs/xfs_fsops.c              |   4
 fs/xfs/xfs_health.c             |   2
 fs/xfs/xfs_icache.c             |  74 ++++++
 fs/xfs/xfs_inode.c              |  19 +
 fs/xfs/xfs_inode.h              |  36 ++-
 fs/xfs/xfs_inode_item.c         |   7 -
 fs/xfs/xfs_inode_item_recover.c |   2
 fs/xfs/xfs_ioctl.c              |   7 +
 fs/xfs/xfs_iops.c               |  15 +
 fs/xfs/xfs_itable.c             |  33 ++
 fs/xfs/xfs_itable.h             |   3
 fs/xfs/xfs_message.c            |  47 ++++
 fs/xfs/xfs_message.h            |  19 +
 fs/xfs/xfs_mount.c              |  31 ++
 fs/xfs/xfs_mount.h              |  11 +
 fs/xfs/xfs_qm.c                 |  36 +++
 fs/xfs/xfs_quota.h              |   5
 fs/xfs/xfs_rtalloc.c            |  38 ++-
 fs/xfs/xfs_super.c              |  13 -
 fs/xfs/xfs_trace.c              |   2
 fs/xfs/xfs_trace.h              | 102 ++++++++
 fs/xfs/xfs_trans_dquot.c        |   6
 fs/xfs/xfs_xattr.c              |   3
 71 files changed, 2324 insertions(+), 201 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_metadir.c
 create mode 100644 fs/xfs/libxfs/xfs_metadir.h
 create mode 100644 fs/xfs/libxfs/xfs_metafile.c
 create mode 100644 fs/xfs/libxfs/xfs_metafile.h
 create mode 100644 fs/xfs/scrub/metapath.c


[merged mm-hotfixes-stable] mm-avoid-unconditional-one-tick-sleep-when-swapcache_prepare-fails.patch removed from -mm tree

by Andrew Morton

The quilt patch titled
     Subject: mm: avoid unconditional one-tick sleep when swapcache_prepare fails
has been removed from the -mm tree.  Its filename was
     mm-avoid-unconditional-one-tick-sleep-when-swapcache_prepare-fails.patch

This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

------------------------------------------------------
From: Barry Song <v-songbaohua(a)oppo.com>
Subject: mm: avoid unconditional one-tick sleep when swapcache_prepare fails
Date: Fri, 27 Sep 2024 09:19:36 +1200

Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") introduced an unconditional one-tick sleep when `swapcache_prepare()` fails, which has led to reports of UI stuttering on latency-sensitive Android devices. To address this, we can use a waitqueue to wake up tasks that fail `swapcache_prepare()` sooner, instead of always sleeping for a full tick. While tasks may occasionally be woken by an unrelated `do_swap_page()`, this method is preferable to two scenarios: rapid re-entry into page faults, which can cause livelocks, and multiple millisecond sleeps, which visibly degrade user experience.

Oven's testing shows that a single waitqueue resolves the UI stuttering issue. If a 'thundering herd' problem becomes apparent later, a waitqueue hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]` for page bit locks can be introduced.

[v-songbaohua(a)oppo.com: wake_up only when swapcache_wq waitqueue is active]
  Link: https://lkml.kernel.org/r/20241008130807.40833-1-21cnbao@gmail.com
Link: https://lkml.kernel.org/r/20240926211936.75373-1-21cnbao@gmail.com
Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
Signed-off-by: Barry Song <v-songbaohua(a)oppo.com>
Reported-by: Oven Liyang <liyangouwen1(a)oppo.com>
Tested-by: Oven Liyang <liyangouwen1(a)oppo.com>
Cc: Kairui Song <kasong(a)tencent.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Chris Li <chrisl(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Yosry Ahmed <yosryahmed(a)google.com>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: Kalesh Singh <kaleshsingh(a)google.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/memory.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/mm/memory.c~mm-avoid-unconditional-one-tick-sleep-when-swapcache_prepare-fails
+++ a/mm/memory.c
@@ -4187,6 +4187,8 @@ static struct folio *alloc_swap_folio(st
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */

+static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -4199,6 +4201,7 @@ vm_fault_t do_swap_page(struct vm_fault
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct folio *swapcache, *folio = NULL;
+	DECLARE_WAITQUEUE(wait, current);
 	struct page *page;
 	struct swap_info_struct *si = NULL;
 	rmap_t rmap_flags = RMAP_NONE;
@@ -4297,7 +4300,9 @@ vm_fault_t do_swap_page(struct vm_fault
 					 * Relax a bit to prevent rapid
 					 * repeated page faults.
 					 */
+					add_wait_queue(&swapcache_wq, &wait);
 					schedule_timeout_uninterruptible(1);
+					remove_wait_queue(&swapcache_wq, &wait);
 					goto out_page;
 				}
 				need_clear_cache = true;
@@ -4604,8 +4609,11 @@ unlock:
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 out:
 	/* Clear the swap cache pin for direct swapin after PTL unlock */
-	if (need_clear_cache)
+	if (need_clear_cache) {
 		swapcache_clear(si, entry, nr_pages);
+		if (waitqueue_active(&swapcache_wq))
+			wake_up(&swapcache_wq);
+	}
 	if (si)
 		put_swap_device(si);
 	return ret;
@@ -4620,8 +4628,11 @@ out_release:
 		folio_unlock(swapcache);
 		folio_put(swapcache);
 	}
-	if (need_clear_cache)
+	if (need_clear_cache) {
 		swapcache_clear(si, entry, nr_pages);
+		if (waitqueue_active(&swapcache_wq))
+			wake_up(&swapcache_wq);
+	}
 	if (si)
 		put_swap_device(si);
 	return ret;
_

Patches currently in -mm which might be from v-songbaohua(a)oppo.com are

mm-fix-pswpin-counter-for-large-folios-swap-in.patch


+ mm-mmap-limit-thp-aligment-of-anonymous-mappings-to-pmd-aligned-sizes.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: mm, mmap: limit THP aligment of anonymous mappings to PMD-aligned sizes
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     mm-mmap-limit-thp-aligment-of-anonymous-mappings-to-pmd-aligned-sizes.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, mmap: limit THP aligment of anonymous mappings to PMD-aligned sizes
Date: Thu, 24 Oct 2024 17:12:29 +0200

Since commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries") a mmap() of anonymous memory without a specific address hint and of at least PMD_SIZE will be aligned to PMD so that it can benefit from a THP backing page.

However this change has been shown to regress some workloads significantly. [1] reports regressions in various spec benchmarks, with up to 600% slowdown of the cactusBSSN benchmark on some platforms. The benchmark seems to create many mappings of 4632kB, which would have merged to a large THP-backed area before commit efa7df3e3bb5 and now they are fragmented to multiple areas each aligned to PMD boundary with gaps between. The regression then seems to be caused mainly due to the benchmark's memory access pattern suffering from TLB or cache aliasing due to the aligned boundaries of the individual areas.

Another known regression bisected to commit efa7df3e3bb5 is darktable [2] [3] and early testing suggests this patch fixes the regression there as well.

To fix the regression but still try to benefit from THP-friendly anonymous mapping alignment, add a condition that the size of the mapping must be a multiple of PMD size instead of at least PMD size. In case of many odd-sized mapping like the cactusBSSN creates, those will stop being aligned and with gaps between, and instead naturally merge again.

Link: https://lkml.kernel.org/r/20241024151228.101841-2-vbabka@suse.cz
Fixes: efa7df3e3bb5 ("mm: align larger anonymous mappings on THP boundaries")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Michael Matz <matz(a)suse.de>
Debugged-by: Gabriel Krisman Bertazi <gabriel(a)krisman.be>
Closes: https://bugzilla.suse.com/show_bug.cgi?id=1229012 [1]
Reported-by: Matthias Bodenbinder <matthias(a)bodenbinder.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219366 [2]
Closes: https://lore.kernel.org/all/2050f0d4-57b0-481d-bab8-05e8d48fed0c@leemhuis.i… [3]
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Yang Shi <yang(a)os.amperecomputing.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Petr Tesarik <ptesarik(a)suse.com>
Cc: Thorsten Leemhuis <regressions(a)leemhuis.info>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/mmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/mmap.c~mm-mmap-limit-thp-aligment-of-anonymous-mappings-to-pmd-aligned-sizes
+++ a/mm/mmap.c
@@ -900,7 +900,8 @@ __get_unmapped_area(struct file *file, u

 	if (get_area) {
 		addr = get_area(file, addr, len, pgoff, flags);
-	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)
+		   && IS_ALIGNED(len, PMD_SIZE)) {
 		/* Ensures that larger anonymous mappings are THP aligned. */
 		addr = thp_get_unmapped_area_vmflags(file, addr, len,
 						     pgoff, flags, vm_flags);
_

Patches currently in -mm which might be from vbabka(a)suse.cz are

mm-mmap-limit-thp-aligment-of-anonymous-mappings-to-pmd-aligned-sizes.patch
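
The difference between the old and new size condition can be checked with a little arithmetic. The following is a userspace sketch only, assuming a 2 MiB PMD size (as on x86-64 with 4 KiB base pages); the function and macro names are hypothetical and only mirror the two conditions described in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 2 MiB PMD size; other architectures and page sizes differ. */
#define SKETCH_PMD_SIZE ((size_t)2 * 1024 * 1024)

/* Old rule (since efa7df3e3bb5): any mapping of at least PMD size qualifies. */
bool thp_align_old(size_t len)
{
	return len >= SKETCH_PMD_SIZE;
}

/* New rule (this patch): only lengths that are a multiple of PMD size qualify. */
bool thp_align_new(size_t len)
{
	return len != 0 && len % SKETCH_PMD_SIZE == 0;
}
```

With these definitions, the 4632kB mapping size mentioned for cactusBSSN satisfies the old rule but not the new one, while a 4 MiB mapping satisfies both.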


+ mm-shrinker-avoid-memleak-in-alloc_shrinker_info.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: mm: shrinker: avoid memleak in alloc_shrinker_info
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     mm-shrinker-avoid-memleak-in-alloc_shrinker_info.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Chen Ridong <chenridong(a)huawei.com>
Subject: mm: shrinker: avoid memleak in alloc_shrinker_info
Date: Fri, 25 Oct 2024 06:09:42 +0000

A memleak was found as below:

unreferenced object 0xffff8881010d2a80 (size 32):
  comm "mkdir", pid 1559, jiffies 4294932666
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  @...............
  backtrace (crc 2e7ef6fa):
    [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470
    [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0
    [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0
    [<ffffffff81198dd9>] online_css+0x29/0xa0
    [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360
    [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0
    [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90
    [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220
    [<ffffffff813e1c97>] do_mkdirat+0x87/0x130
    [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70
    [<ffffffff81f8c928>] do_syscall_64+0x68/0x140
    [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

In alloc_shrinker_info(), when shrinker_unit_alloc() returns an error, the info won't be freed.  Just fix it.

Link: https://lkml.kernel.org/r/20241025060942.1049263-1-chenridong@huaweicloud.c…
Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}")
Signed-off-by: Chen Ridong <chenridong(a)huawei.com>
Acked-by: Qi Zheng <zhengqi.arch(a)bytedance.com>
Acked-by: Roman Gushchin <roman.gushchin(a)linux.dev>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Dave Chinner <david(a)fromorbit.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Wang Weiyang <wangweiyang2(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/shrinker.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/mm/shrinker.c~mm-shrinker-avoid-memleak-in-alloc_shrinker_info
+++ a/mm/shrinker.c
@@ -76,19 +76,21 @@ void free_shrinker_info(struct mem_cgrou

 int alloc_shrinker_info(struct mem_cgroup *memcg)
 {
-	struct shrinker_info *info;
 	int nid, ret = 0;
 	int array_size = 0;

 	mutex_lock(&shrinker_mutex);
 	array_size = shrinker_unit_size(shrinker_nr_max);
 	for_each_node(nid) {
-		info = kvzalloc_node(sizeof(*info) + array_size, GFP_KERNEL, nid);
+		struct shrinker_info *info = kvzalloc_node(sizeof(*info) + array_size,
+							   GFP_KERNEL, nid);
 		if (!info)
 			goto err;
 		info->map_nr_max = shrinker_nr_max;
-		if (shrinker_unit_alloc(info, NULL, nid))
+		if (shrinker_unit_alloc(info, NULL, nid)) {
+			kvfree(info);
 			goto err;
+		}
 		rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info);
 	}
 	mutex_unlock(&shrinker_mutex);
_

Patches currently in -mm which might be from chenridong(a)huawei.com are

mm-shrinker-avoid-memleak-in-alloc_shrinker_info.patch
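
The leak pattern fixed here (an object allocated, a follow-up allocation fails, and the first object is never published or freed) can be reproduced in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel code: struct info_sketch, unit_alloc(), info_alloc() and the live_objects counter are all hypothetical stand-ins, with calloc()/free() in place of kvzalloc_node()/kvfree().

```c
#include <assert.h>
#include <stdlib.h>

int live_objects;	/* number of allocations not yet freed */

static void *tracked_calloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (p)
		live_objects++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_objects--;
		free(p);
	}
}

struct info_sketch {
	int *unit;	/* secondary allocation hanging off the object */
};

/* Hypothetical stand-in for shrinker_unit_alloc(); fails on request. */
static int unit_alloc(struct info_sketch *info, int fail)
{
	if (fail)
		return -1;
	info->unit = tracked_calloc(1, sizeof(*info->unit));
	return info->unit ? 0 : -1;
}

/*
 * Allocate one object. If the secondary allocation fails, the object has
 * not been published anywhere yet, so it must be freed right here.
 */
struct info_sketch *info_alloc(int fail_unit)
{
	struct info_sketch *info = tracked_calloc(1, sizeof(*info));

	if (!info)
		return NULL;
	if (unit_alloc(info, fail_unit)) {
		tracked_free(info);	/* the step the patch adds */
		return NULL;
	}
	return info;
}

void info_free(struct info_sketch *info)
{
	if (info) {
		tracked_free(info->unit);
		tracked_free(info);
	}
}
```

Removing the tracked_free(info) call in the failure branch leaves live_objects at 1 after a failed info_alloc(1), which is the userspace analogue of the kmemleak report above.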


[PATCH 0/2] usb: dwc3: Disable susphy during initialization

by Thinh Nguyen

We notice some platforms set "snps,dis_u3_susphy_quirk" and "snps,dis_u2_susphy_quirk" when they should not need to. Just make sure that the GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY are clear during initialization. The host initialization involved xhci. So the dwc3 needs to implement the xhci_plat_priv->plat_start() for xhci to re-enable the suspend bits.

Since there's a prerequisite patch to drivers/usb/host/xhci-plat.h that's not a fix patch, this series should go on Greg's usb-testing branch instead of usb-linus.

Thinh Nguyen (2):
  usb: xhci-plat: Don't include xhci.h
  usb: dwc3: core: Prevent phy suspend during init

 drivers/usb/dwc3/core.c      | 90 +++++++++++++++---------------------
 drivers/usb/dwc3/core.h      |  1 +
 drivers/usb/dwc3/gadget.c    |  2 +
 drivers/usb/dwc3/host.c      | 27 +++++++++++
 drivers/usb/host/xhci-plat.h |  4 +-
 5 files changed, 71 insertions(+), 53 deletions(-)

base-commit: 3d122e6d27e417a9fa91181922743df26b2cd679
-- 
2.28.0


+ vmscanmigrate-fix-double-decrement-on-node-stats-when-demoting-pages.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: vmscan,migrate: fix double-decrement on node stats when demoting pages
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     vmscanmigrate-fix-double-decrement-on-node-stats-when-demoting-pages.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Gregory Price <gourry(a)gourry.net>
Subject: vmscan,migrate: fix double-decrement on node stats when demoting pages
Date: Fri, 25 Oct 2024 10:17:24 -0400

When numa balancing is enabled with demotion, vmscan will call migrate_pages when shrinking LRUs. Successful demotions will cause node vmstat numbers to double-decrement, leading to an imbalanced page count. The result is dmesg output like such:

$ cat /proc/sys/vm/stat_refresh

[77383.088417] vmstat_refresh: nr_isolated_anon -103212
[77383.088417] vmstat_refresh: nr_isolated_file -899642

This negative value may impact compaction and reclaim throttling.

The double-decrement occurs in the migrate_pages path:

caller to shrink_folio_list decrements the count
  shrink_folio_list
    demote_folio_list
      migrate_pages
        migrate_pages_batch
          migrate_folio_move
            migrate_folio_done
              mod_node_page_state(-ve) <- second decrement

This path happens for SUCCESSFUL migrations, not failures. Typically callers to migrate_pages are required to handle putback/accounting for failures, but this is already handled in the shrink code.

When accounting for migrations, instead do not decrement the count when the migration reason is MR_DEMOTION. As of v6.11, this demotion logic is the only source of MR_DEMOTION.

Link: https://lkml.kernel.org/r/20241025141724.17927-1-gourry@gourry.net
Fixes: 26aa2d199d6f2 ("mm/migrate: demote pages during reclaim")
Signed-off-by: Gregory Price <gourry(a)gourry.net>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/migrate.c~vmscanmigrate-fix-double-decrement-on-node-stats-when-demoting-pages
+++ a/mm/migrate.c
@@ -1178,7 +1178,7 @@ static void migrate_folio_done(struct fo
 	 * not accounted to NR_ISOLATED_*. They can be recognized
 	 * as __folio_test_movable
 	 */
-	if (likely(!__folio_test_movable(src)))
+	if (likely(!__folio_test_movable(src)) && reason != MR_DEMOTION)
 		mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
 				    folio_is_file_lru(src), -folio_nr_pages(src));

_

Patches currently in -mm which might be from gourry(a)gourry.net are

vmscanmigrate-fix-double-decrement-on-node-stats-when-demoting-pages.patch
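
The accounting imbalance described in the changelog can be modelled with a single counter. The sketch below is a toy userspace model under stated assumptions: the enum, the counter and both functions are hypothetical stand-ins for NR_ISOLATED_*, migrate_folio_done() and the reclaim path, and the "patched" flag only selects between the pre- and post-patch condition.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_migrate_reason { SKETCH_MR_COMPACTION, SKETCH_MR_DEMOTION };

long nr_isolated;	/* stands in for the NR_ISOLATED_* node counter */

/*
 * Stand-in for migrate_folio_done(): drop the isolated count, except (when
 * patched) for demotion, where the reclaim caller does the accounting.
 */
static void folio_done(long nr_pages, enum sketch_migrate_reason reason,
		       bool patched)
{
	if (!patched || reason != SKETCH_MR_DEMOTION)
		nr_isolated -= nr_pages;
}

/* Stand-in for the reclaim path: isolate, demote successfully, account. */
long demote(long nr_pages, bool patched)
{
	nr_isolated = 0;
	nr_isolated += nr_pages;				/* isolated from the LRU */
	folio_done(nr_pages, SKETCH_MR_DEMOTION, patched);	/* migration succeeded */
	nr_isolated -= nr_pages;				/* caller's own decrement */
	return nr_isolated;
}
```

Without the reason check the counter ends up negative by the number of demoted pages, which matches the sign of the vmstat_refresh values quoted above; with the check it returns to zero.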


[PATCH v2 1/4] KVM: arm64: Don't retire aborted MMIO instruction

by Oliver Upton

Returning an abort to the guest for an unsupported MMIO access is a documented feature of the KVM UAPI. Nevertheless, it's clear that this plumbing has seen limited testing, since userspace can trivially cause a WARN in the MMIO return:

  WARNING: CPU: 0 PID: 30558 at arch/arm64/include/asm/kvm_emulate.h:536 kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
  Call trace:
   kvm_handle_mmio_return+0x46c/0x5c4 arch/arm64/include/asm/kvm_emulate.h:536
   kvm_arch_vcpu_ioctl_run+0x98/0x15b4 arch/arm64/kvm/arm.c:1133
   kvm_vcpu_ioctl+0x75c/0xa78 virt/kvm/kvm_main.c:4487
   __do_sys_ioctl fs/ioctl.c:51 [inline]
   __se_sys_ioctl fs/ioctl.c:893 [inline]
   __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:893
   __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
   invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
   el0_svc_common+0x1e0/0x23c arch/arm64/kernel/syscall.c:132
   do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
   el0_svc+0x38/0x68 arch/arm64/kernel/entry-common.c:712
   el0t_64_sync_handler+0x90/0xfc arch/arm64/kernel/entry-common.c:730
   el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

The splat is complaining that KVM is advancing PC while an exception is pending, i.e. that KVM is retiring the MMIO instruction despite a pending synchronous external abort. Womp womp.

Fix the glaring UAPI bug by skipping over all the MMIO emulation in case there is a pending synchronous exception. Note that while userspace is capable of pending an asynchronous exception (SError, IRQ, or FIQ), it is still safe to retire the MMIO instruction in this case as (1) they are by definition asynchronous, and (2) KVM relies on hardware support for pending/delivering these exceptions instead of the software state machine for advancing PC.

Cc: stable(a)vger.kernel.org
Fixes: da345174ceca ("KVM: arm/arm64: Allow user injection of external data aborts")
Reported-by: Alexander Potapenko <glider(a)google.com>
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
---
 arch/arm64/kvm/mmio.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index cd6b7b83e2c3..ab365e839874 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -72,6 +72,31 @@ unsigned long kvm_mmio_read_buf(const void *buf, unsigned int len)
 	return data;
 }

+static bool kvm_pending_sync_exception(struct kvm_vcpu *vcpu)
+{
+	if (!vcpu_get_flag(vcpu, PENDING_EXCEPTION))
+		return false;
+
+	if (vcpu_el1_is_32bit(vcpu)) {
+		switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+		case unpack_vcpu_flag(EXCEPT_AA32_UND):
+		case unpack_vcpu_flag(EXCEPT_AA32_IABT):
+		case unpack_vcpu_flag(EXCEPT_AA32_DABT):
+			return true;
+		default:
+			return false;
+		}
+	} else {
+		switch (vcpu_get_flag(vcpu, EXCEPT_MASK)) {
+		case unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC):
+		case unpack_vcpu_flag(EXCEPT_AA64_EL2_SYNC):
+			return true;
+		default:
+			return false;
+		}
+	}
+}
+
 /**
  * kvm_handle_mmio_return -- Handle MMIO loads after user space emulation
  *			     or in-kernel IO emulation
@@ -84,8 +109,11 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
 	unsigned int len;
 	int mask;

-	/* Detect an already handled MMIO return */
-	if (unlikely(!vcpu->mmio_needed))
+	/*
+	 * Detect if the MMIO return was already handled or if userspace aborted
+	 * the MMIO access.
+	 */
+	if (unlikely(!vcpu->mmio_needed || kvm_pending_sync_exception(vcpu)))
 		return 1;

 	vcpu->mmio_needed = 0;
-- 
2.47.0.163.g1226f6d8fa-goog


+ sched-numa-fix-the-potential-null-pointer-dereference-in-task_numa_work.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled
     Subject: sched/numa: fix the potential null pointer dereference in task_numa_work()
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     sched-numa-fix-the-potential-null-pointer-dereference-in-task_numa_work.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Shawn Wang <shawnwang(a)linux.alibaba.com>
Subject: sched/numa: fix the potential null pointer dereference in task_numa_work()
Date: Fri, 25 Oct 2024 10:22:08 +0800

When running stress-ng-vm-segv test, we found a null pointer dereference
error in task_numa_work().  Here is the backtrace:

  [323676.066985] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
  ......
  [323676.067108] CPU: 35 PID: 2694524 Comm: stress-ng-vm-se
  ......
  [323676.067113] pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
  [323676.067115] pc : vma_migratable+0x1c/0xd0
  [323676.067122] lr : task_numa_work+0x1ec/0x4e0
  [323676.067127] sp : ffff8000ada73d20
  [323676.067128] x29: ffff8000ada73d20 x28: 0000000000000000 x27: 000000003e89f010
  [323676.067130] x26: 0000000000080000 x25: ffff800081b5c0d8 x24: ffff800081b27000
  [323676.067133] x23: 0000000000010000 x22: 0000000104d18cc0 x21: ffff0009f7158000
  [323676.067135] x20: 0000000000000000 x19: 0000000000000000 x18: ffff8000ada73db8
  [323676.067138] x17: 0001400000000000 x16: ffff800080df40b0 x15: 0000000000000035
  [323676.067140] x14: ffff8000ada73cc8 x13: 1fffe0017cc72001 x12: ffff8000ada73cc8
  [323676.067142] x11: ffff80008001160c x10: ffff000be639000c x9 : ffff8000800f4ba4
  [323676.067145] x8 : ffff000810375000 x7 : ffff8000ada73974 x6 : 0000000000000001
  [323676.067147] x5 : 0068000b33e26707 x4 : 0000000000000001 x3 : ffff0009f7158000
  [323676.067149] x2 : 0000000000000041 x1 : 0000000000004400 x0 : 0000000000000000
  [323676.067152] Call trace:
  [323676.067153]  vma_migratable+0x1c/0xd0
  [323676.067155]  task_numa_work+0x1ec/0x4e0
  [323676.067157]  task_work_run+0x78/0xd8
  [323676.067161]  do_notify_resume+0x1ec/0x290
  [323676.067163]  el0_svc+0x150/0x160
  [323676.067167]  el0t_64_sync_handler+0xf8/0x128
  [323676.067170]  el0t_64_sync+0x17c/0x180
  [323676.067173] Code: d2888001 910003fd f9000bf3 aa0003f3 (f9401000)
  [323676.067177] SMP: stopping secondary CPUs
  [323676.070184] Starting crashdump kernel...

stress-ng-vm-segv in stress-ng is used to stress test the SIGSEGV error
handling function of the system, which tries to cause a SIGSEGV error on
return from unmapping the whole address space of the child process.

Normally this program will not cause kernel crashes.  But before the
munmap system call returns to user mode, a potential task_numa_work() for
numa balancing could be added and executed.  In this scenario, since the
child process has no vma after munmap, the vma_next() in task_numa_work()
will return a null pointer even if the vma iterator restarts from 0.

Recheck the vma pointer before dereferencing it in task_numa_work().

Link: https://lkml.kernel.org/r/20241025022208.125527-1-shawnwang@linux.alibaba.c…
Fixes: 214dbc428137 ("sched: convert to vma iterator")
Signed-off-by: Shawn Wang <shawnwang(a)linux.alibaba.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Ben Segall <bsegall(a)google.com>
Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Steven Rostedt (Google) <rostedt(a)goodmis.org>
Cc: Valentin Schneider <vschneid(a)redhat.com>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: <stable(a)vger.kernel.org>	[6.2+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 kernel/sched/fair.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/sched/fair.c~sched-numa-fix-the-potential-null-pointer-dereference-in-task_numa_work
+++ a/kernel/sched/fair.c
@@ -3369,7 +3369,7 @@ retry_pids:
 		vma = vma_next(&vmi);
 	}
 
-	do {
+	for (; vma; vma = vma_next(&vmi)) {
 		if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
 			is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_MIXEDMAP)) {
 			trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_UNSUITABLE);
@@ -3491,7 +3491,7 @@ retry_pids:
 		 */
 		if (vma_pids_forced)
 			break;
-	} for_each_vma(vmi, vma);
+	}
 
 	/*
 	 * If no VMAs are remaining and VMAs were skipped due to the PID
_

Patches currently in -mm which might be from shawnwang(a)linux.alibaba.com are

sched-numa-fix-the-potential-null-pointer-dereference-in-task_numa_work.patch
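The loop change in the patch above can be illustrated outside the kernel.
What follows is a minimal userspace sketch, not kernel code: toy_vma,
toy_iter, toy_next() and count_migratable() are hypothetical stand-ins
invented for this illustration, and the "migratable" flag only loosely
mirrors the checks done in task_numa_work(). It shows why a for loop that
tests the pointer before the body is safe when the iterator yields nothing,
whereas a do/while body would run once with a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a VMA and a VMA iterator (not kernel types). */
struct toy_vma {
	int migratable;
};

struct toy_iter {
	struct toy_vma *vmas;	/* backing array, NULL when the space is empty */
	size_t count;		/* number of entries in the array */
	size_t pos;		/* index of the next entry to hand out */
};

/* Return the next entry, or NULL once the iterator is exhausted. */
static struct toy_vma *toy_next(struct toy_iter *it)
{
	if (it->pos >= it->count)
		return NULL;
	return &it->vmas[it->pos++];
}

/*
 * Count migratable entries using the loop shape from the patch: the
 * pointer is tested before every pass through the body, so an empty
 * address space gives zero iterations instead of a NULL dereference.
 */
static size_t count_migratable(struct toy_iter *it)
{
	struct toy_vma *vma;
	size_t n = 0;

	assert(it != NULL);

	vma = toy_next(it);
	if (!vma) {
		/* Restart from 0; this can still yield NULL if nothing is mapped. */
		it->pos = 0;
		vma = toy_next(it);
	}

	for (; vma; vma = toy_next(it)) {
		if (!vma->migratable)
			continue;	/* the for-loop step still advances vma */
		n++;
	}

	return n;
}
```

With an empty iterator (count == 0) count_migratable() returns 0 without
touching any entry. A do { ... } while loop in the same position would
evaluate vma->migratable on the NULL pointer first, which corresponds to the
crash in the backtrace.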


[PATCH] pinctrl: qcom: spmi: fix debugfs drive strength

by Johan Hovold

Commit 723e8462a4fe ("pinctrl: qcom: spmi-gpio: Fix the GPIO strength
mapping") fixed a long-standing issue in the Qualcomm SPMI PMIC gpio
driver which had the 'low' and 'high' drive strength settings switched
but failed to update the debugfs interface which still gets this wrong.

Fix the debugfs code so that the exported values match the hardware
settings.

Note that this probably means that most devicetrees that try to describe
the firmware settings got this wrong if the settings were derived from
debugfs. Before the above mentioned commit the settings would have
actually matched the firmware settings even if they were described
incorrectly, but now they are inverted.

Fixes: 723e8462a4fe ("pinctrl: qcom: spmi-gpio: Fix the GPIO strength mapping")
Fixes: eadff3024472 ("pinctrl: Qualcomm SPMI PMIC GPIO pin controller driver")
Cc: Anjelique Melendez <quic_amelende(a)quicinc.com>
Cc: stable(a)vger.kernel.org	# 3.19
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
index 3d03293f6320..3a12304e2b7d 100644
--- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
+++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
@@ -667,7 +667,7 @@ static void pmic_gpio_config_dbg_show(struct pinctrl_dev *pctldev,
 		"push-pull", "open-drain", "open-source"
 	};
 	static const char *const strengths[] = {
-		"no", "high", "medium", "low"
+		"no", "low", "medium", "high"
 	};
 
 	pad = pctldev->desc->pins[pin].drv_data;
-- 
2.45.2
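The effect of the one-line change above can be shown with a small userspace
sketch. Only the strengths[] table is taken from the patch. The helper
strength_name(), its "unknown" fallback and check_strength_order() are
hypothetical additions for illustration, and the sketch assumes the debugfs
code indexes the table directly with the stored strength value.

```c
#include <assert.h>
#include <stddef.h>

/* Name table in the order introduced by the patch. */
static const char *const strengths[] = {
	"no", "low", "medium", "high"
};

#define N_STRENGTHS (sizeof(strengths) / sizeof(strengths[0]))

/* Hypothetical helper: map a raw strength value to its printed name. */
static const char *strength_name(unsigned int val)
{
	if (val >= N_STRENGTHS)
		return "unknown";	/* fallback invented for this sketch */
	return strengths[val];
}

/* Check the ordering: with the fixed table, 1 prints "low" and 3 "high". */
static void check_strength_order(void)
{
	assert(strength_name(1)[0] == 'l');
	assert(strength_name(3)[0] == 'h');
}
```

With the pre-patch ordering ("no", "high", "medium", "low") the same lookups
would print "high" for 1 and "low" for 3, which is the inversion the commit
message describes.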
