From: Lang Yu <lang.yu(a)amd.com>
Subject: mm/kmemleak: avoid scanning potential huge holes
When using devm_request_free_mem_region() and devm_memremap_pages() to add
ZONE_DEVICE memory, if requested free mem region's end pfn were huge(e.g.,
0x400000000), the node_end_pfn() will be also huge (see
move_pfn_range_to_zone()). Thus it creates a huge hole between
node_start_pfn() and node_end_pfn().
We found on some AMD APUs, amdkfd requested such a free mem region and
created a huge hole. In such a case, following code snippet was just
doing busy test_bit() looping on the huge hole.
for (pfn = start_pfn; pfn < end_pfn; pfn++) {
struct page *page = pfn_to_online_page(pfn);
if (!page)
continue;
...
}
So we got a soft lockup:
watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221]
CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1
RIP: 0010:pfn_to_online_page+0x5/0xd0
Call Trace:
? kmemleak_scan+0x16a/0x440
kmemleak_write+0x306/0x3a0
? common_file_perm+0x72/0x170
full_proxy_write+0x5c/0x90
vfs_write+0xb9/0x260
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
I did some tests with the patch.
(1) amdgpu module unloaded
before the patch:
real 0m0.976s
user 0m0.000s
sys 0m0.968s
after the patch:
real 0m0.981s
user 0m0.000s
sys 0m0.973s
(2) amdgpu module loaded
before the patch:
real 0m35.365s
user 0m0.000s
sys 0m35.354s
after the patch:
real 0m1.049s
user 0m0.000s
sys 0m1.042s
Link: https://lkml.kernel.org/r/20211108140029.721144-1-lang.yu@amd.com
Signed-off-by: Lang Yu <lang.yu(a)amd.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kmemleak.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
--- a/mm/kmemleak.c~mm-kmemleak-avoid-scanning-potential-huge-holes
+++ a/mm/kmemleak.c
@@ -1410,7 +1410,8 @@ static void kmemleak_scan(void)
{
unsigned long flags;
struct kmemleak_object *object;
- int i;
+ struct zone *zone;
+ int __maybe_unused i;
int new_leaks = 0;
jiffies_last_scan = jiffies;
@@ -1450,9 +1451,9 @@ static void kmemleak_scan(void)
* Struct page scanning for each node.
*/
get_online_mems();
- for_each_online_node(i) {
- unsigned long start_pfn = node_start_pfn(i);
- unsigned long end_pfn = node_end_pfn(i);
+ for_each_populated_zone(zone) {
+ unsigned long start_pfn = zone->zone_start_pfn;
+ unsigned long end_pfn = zone_end_pfn(zone);
unsigned long pfn;
for (pfn = start_pfn; pfn < end_pfn; pfn++) {
@@ -1461,8 +1462,8 @@ static void kmemleak_scan(void)
if (!page)
continue;
- /* only scan pages belonging to this node */
- if (page_to_nid(page) != i)
+ /* only scan pages belonging to this zone */
+ if (page_zone(page) != zone)
continue;
/* only scan if page is in use */
if (page_count(page) == 0)
_
From: Mike Rapoport <rppt(a)linux.ibm.com>
Subject: mm/pgtable: define pte_index so that preprocessor could recognize it
Since commit 974b9b2c68f3 ("mm: consolidate pte_index() and pte_offset_*()
definitions") pte_index is a static inline and there is no define for it
that can be recognized by the preprocessor. As a result,
vm_insert_pages() uses slower loop over vm_insert_page() instead of
insert_pages() that amortizes the cost of spinlock operations when
inserting multiple pages.
Link: https://lkml.kernel.org/r/20220111145457.20748-1-rppt@kernel.org
Fixes: 974b9b2c68f3 ("mm: consolidate pte_index() and pte_offset_*() definitions")
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reported-by: Christian Dietrich <stettberger(a)dokucode.de>
Reviewed-by: Khalid Aziz <khalid.aziz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/pgtable.h | 1 +
1 file changed, 1 insertion(+)
--- a/include/linux/pgtable.h~mm-pgtable-define-pte_index-so-that-preprocessor-could-recognize-it
+++ a/include/linux/pgtable.h
@@ -62,6 +62,7 @@ static inline unsigned long pte_index(un
{
return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}
+#define pte_index pte_index
#ifndef pmd_index
static inline unsigned long pmd_index(unsigned long address)
_
From: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Subject: mm/debug_vm_pgtable: remove pte entry from the page table
Patch series "page table check fixes and cleanups", v5.
This patch (of 4):
The pte entry that is used in pte_advanced_tests() is never removed from
the page table at the end of the test.
The issue is detected by page_table_check, to repro compile kernel with
the following configs:
CONFIG_DEBUG_VM_PGTABLE=y
CONFIG_PAGE_TABLE_CHECK=y
CONFIG_PAGE_TABLE_CHECK_ENFORCED=y
During the boot the following BUG is printed:
[ 2.262821] debug_vm_pgtable: [debug_vm_pgtable ]: Validating
architecture page table helpers
[ 2.276826] ------------[ cut here ]------------
[ 2.280426] kernel BUG at mm/page_table_check.c:162!
[ 2.284118] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 2.287787] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
5.16.0-11413-g2c271fe77d52 #3
[ 2.293226] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org
04/01/2014
...
The entry should be properly removed from the page table before the page
is released to the free list.
Link: https://lkml.kernel.org/r/20220131203249.2832273-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20220131203249.2832273-2-pasha.tatashin@soleen.com
Fixes: a5c3b9ffb0f4 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers")
Signed-off-by: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Tested-by: Zi Yan <ziy(a)nvidia.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Paul Turner <pjt(a)google.com>
Cc: Wei Xu <weixugc(a)google.com>
Cc: Greg Thelen <gthelen(a)google.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/debug_vm_pgtable.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-remove-pte-entry-from-the-page-table
+++ a/mm/debug_vm_pgtable.c
@@ -171,6 +171,8 @@ static void __init pte_advanced_tests(st
ptep_test_and_clear_young(args->vma, args->vaddr, args->ptep);
pte = ptep_get(args->ptep);
WARN_ON(pte_young(pte));
+
+ ptep_get_and_clear_full(args->mm, args->vaddr, args->ptep, 1);
}
static void __init pte_savedwrite_tests(struct pgtable_debug_args *args)
_
Commit 309a62fa3a9e ("bio-integrity: bio_integrity_advance must update
integrity seed") added code to update the integrity seed value when
advancing a bio. However, it failed to take into account that the
integrity interval might be larger than the 512-byte block layer
sector size. This broke bio splitting on PI devices with 4KB logical
blocks.
The seed value should be advanced by bio_integrity_intervals() and not
the number of sectors.
Cc: Dmitry Monakhov <dmonakhov(a)openvz.org>
Cc: stable(a)vger.kernel.org
Fixes: 309a62fa3a9e ("bio-integrity: bio_integrity_advance must update integrity seed")
Tested-by: Dmitry Ivanov <dmitry.ivanov2(a)hpe.com>
Reported-by: Alexey Lyashkov <alexey.lyashkov(a)hpe.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
---
block/bio-integrity.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index d25114715459..0827b19820c5 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -373,7 +373,7 @@ void bio_integrity_advance(struct bio *bio, unsigned int bytes_done)
struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk);
unsigned bytes = bio_integrity_bytes(bi, bytes_done >> 9);
- bip->bip_iter.bi_sector += bytes_done >> 9;
+ bip->bip_iter.bi_sector += bio_integrity_intervals(bi, bytes_done >> 9);
bvec_iter_advance(bip->bip_vec, &bip->bip_iter, bytes);
}
--
2.32.0
As Pavel Machek reported in below link [1]:
After commit 77900c45ee5c ("f2fs: fix to do sanity check in is_alive()"),
node page should be unlock via calling f2fs_put_page() in the error path
of is_alive(), otherwise, f2fs may hang when it tries to lock the node
page, fix it.
[1] https://lore.kernel.org/stable/20220124203637.GA19321@duo.ucw.cz/
Fixes: 77900c45ee5c ("f2fs: fix to do sanity check in is_alive()")
Cc: <stable(a)vger.kernel.org>
Reported-by: Pavel Machek <pavel(a)denx.de>
Signed-off-by: Pavel Machek <pavel(a)denx.de>
Signed-off-by: Chao Yu <chao(a)kernel.org>
---
v2:
- Cc stable-kernel mailing list.
fs/f2fs/gc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 0a6b0a8ae97e..2d53ef121e76 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1038,8 +1038,10 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
set_sbi_flag(sbi, SBI_NEED_FSCK);
}
- if (f2fs_check_nid_range(sbi, dni->ino))
+ if (f2fs_check_nid_range(sbi, dni->ino)) {
+ f2fs_put_page(node_page, 1);
return false;
+ }
*nofs = ofs_of_node(node_page);
source_blkaddr = data_blkaddr(NULL, node_page, ofs_in_node);
--
2.32.0
I’m Mr. Mathieu a banker working with a commercial bank as a Chief
Financial Officer. I have a business deal for our mutual benefit. One
of our customers died few years ago without any registered next-of-kin
in his bio-data/account file which is peculiar with high profile
investors. I think he was your relative because you both share the
same family's name. Right now my bank wants to sit on this money (US$
13 .5 Million) as unclaimed and abandoned funds which I do not want
because of reasons I will explain to you when I receive your response.
Please get back to me as fast as you can with your mobile telephone
number for more details on how the funds can be turned into yours by
right of inheritance for us to claim and share equally afterwards;
Best Regards, I am urgently expecting your reply.
Yours,
Mr. Mathieu