Re: [PATCH v12 1/9] hugetlb_cgroup: Add hugetlb_cgroup reservation counter

18 Feb 2020


      On Tue, 2020-02-18 at 10:35 -0800, Mina Almasry wrote:
...
On Tue, Feb 18, 2020 at 6:21 AM Qian Cai cai@lca.pw wrote:
...
On Tue, 2020-02-11 at 15:19 -0800, Andrew Morton wrote:
...
On Tue, 11 Feb 2020 13:31:20 -0800 Mina Almasry almasrymina@google.com wrote:
...
These counters will track hugetlb reservations rather than hugetlb
memory faulted in. This patch only adds the counter, following patches
add the charging and uncharging of the counter.
We're still pretty thin on review here, but as it's v12 and Mike
appears to be signed up to look at this work, I'll add them to -next to
help move things forward.
Reverted the whole series on the top of next-20200217 fixed a crash below (I
don't see anything in next-20200218 would make any differences).
[ 7933.691114][T35046] LTP: starting hugemmap06
[ 7933.806377][T14355] ------------[ cut here ]------------
[ 7933.806541][T14355] kernel BUG at mm/hugetlb.c:490!
VM_BUG_ON(t - f <= 1);
[ 7933.806562][T14355] Oops: Exception in kernel mode, sig: 5 [#1]
[ 7933.806573][T14355] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256
DEBUG_PAGEALLOC NUMA PowerNV
[ 7933.806594][T14355] Modules linked in: kvm_hv kvm brd ext4 crc16 mbcache jbd2
loop ip_tables x_tables xfs sd_mod bnx2x ahci mdio libahci tg3 libata libphy
firmware_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
binfmt_misc]
[ 7933.806651][T14355] CPU: 54 PID: 14355 Comm: hugemmap06 Tainted:
G           O      5.6.0-rc2-next-20200217 #1
[ 7933.806674][T14355] NIP:  c00000000040d22c LR: c00000000040d210 CTR:
0000000000000000
[ 7933.806696][T14355] REGS: c0000014b71ef660 TRAP: 0700   Tainted:
G           O       (5.6.0-rc2-next-20200217)
[ 7933.806727][T14355] MSR:  900000000282b033
<SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 22022228  XER: 00000000
[ 7933.806772][T14355] CFAR: c00000000040cbec IRQMASK: 0
[ 7933.806772][T14355] GPR00: c00000000040d210 c0000014b71ef8f0 c000000001657000
0000000000000001
[ 7933.806772][T14355] GPR04: 0000000000000012 0000000000000013 0000000000000000
0000000000000000
[ 7933.806772][T14355] GPR08: 0000000000000002 0000000000000002 0000000000000001
0000000000000036
[ 7933.806772][T14355] GPR12: 0000000022022222 c000001ffffd3d00 00007fffad670000
00007fffa4bc0000
[ 7933.806772][T14355] GPR16: 0000000000000000 c000000001567178 c0000014b71efa50
0000000000000000
[ 7933.806772][T14355] GPR20: 0000000000000000 0000000000000013 0000000000000012
0000000000000001
[ 7933.806772][T14355] GPR24: c0000019f74cd270 5deadbeef0000100 5deadbeef0000122
c0000019f74cd2c0
[ 7933.806772][T14355] GPR28: 0000000000000001 c0000019f74cd268 c0000014b71ef918
0000000000000001
[ 7933.806961][T14355] NIP [c00000000040d22c] region_add+0x11c/0x3a0
[ 7933.806980][T14355] LR [c00000000040d210] region_add+0x100/0x3a0
[ 7933.807008][T14355] Call Trace:
[ 7933.807024][T14355] [c0000014b71ef8f0] [c00000000040d210]
region_add+0x100/0x3a0 (unreliable)
[ 7933.807056][T14355] [c0000014b71ef9b0] [c00000000040e0c8]
__vma_reservation_common+0x148/0x210
__vma_reservation_common at mm/hugetlb.c:2150
[ 7933.807087][T14355] [c0000014b71efa20] [c0000000004132a0]
alloc_huge_page+0x350/0x830
alloc_huge_page at mm/hugetlb.c:2359
[ 7933.807100][T14355] [c0000014b71efad0] [c0000000004168f8]
hugetlb_no_page+0x158/0xcb0
[ 7933.807113][T14355] [c0000014b71efc20] [c000000000417bc8]
hugetlb_fault+0x678/0xb30
[ 7933.807136][T14355] [c0000014b71efcd0] [c0000000003b1de4]
handle_mm_fault+0x444/0x450
[ 7933.807158][T14355] [c0000014b71efd20] [c000000000070b1c]
__do_page_fault+0x2bc/0xfd0
[ 7933.807181][T14355] [c0000014b71efe20] [c00000000000aa88]
handle_page_fault+0x10/0x30
[ 7933.807201][T14355] Instruction dump:
[ 7933.807209][T14355] 38c00000 7ea5ab78 7ec4b378 7fa3eb78 4bfff80d e9210020
e91d0050 e95d0068
[ 7933.807232][T14355] 7d3c4850 7d294214 7faa4800 409c0238 <0b170000> 7f03c378
4858c005 60000000
[ 7933.807267][T14355] ---[ end trace 7560275de5f409f8 ]---
[ 7933.905258][T14355]
[ 7934.905339][T14355] Kernel panic - not syncing: Fatal exception
Hi Qian,
Yes this VM_BUG_ON was added by a patch in the series ("hugetlb:
disable region_add file_region coalescing") so it's definitely related
to the series. I'm taking a look at why this VM_BUG_ON fires. Can you
confirm you reproduce this by running hugemmap06 from the ltp on a
powerpc machine? Can I maybe have your config?
Yes, reproduced on both powerpc and x86. Configs are in,
https://github.com/cailca/linux-mm

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH v12 1/9] hugetlb_cgroup: Add hugetlb_cgroup reservation counter