Steve Wahl <steve.wahl@hpe.com> writes:
On Thu, Mar 28, 2024 at 12:05:02AM -0500, Eric W. Biederman wrote:
From my perspective the entire reason for wanting to be fine grained and precise in the kernel memory map is because the UV systems don't have enough MTRRs. So you have to depend upon the cache-ability attributes for specific addresses of memory coming from the page tables instead of from the MTRRs.
It would be more accurate to say we depend upon the addresses not being listed in the page tables at all. We'd be OK with mapped but not accessed, if it weren't for processor speculation. There's no "no access" setting within the existing MTRR definitions, though there may be a setting that would rein in processor speculation enough to make do.
The uncached setting and the write-combining settings that are used for I/O are required to disable speculation for any regions so marked. Any read or write to a memory mapped I/O region can result in the hardware processing it as a command, which as I understand it is exactly the problem with UV systems.
Frankly not mapping an I/O region (in an identity mapped page table) instead of properly mapping it as it would need to be mapped for performing I/O seems like a bit of a bug.
If you had enough MTRRs, defining the page tables to be precisely what is necessary would simply be an exercise in reducing kernel performance, because it is more efficient, in both page table size and TLB usage, to use 1GB pages instead of whatever smaller pages you have to use for oddball regions.
For systems without enough MTRRs the small performance hit in paging performance is the necessary trade off.
At least that is my perspective. Does that make sense?
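To put rough numbers on the page table size argument, here is a minimal sketch that counts the 4 KiB page-table pages needed to identity map a range with a single leaf page size. It assumes ordinary x86-64 4-level paging with 512 entries per table; the names table_pages() and div_round_up() are made up for illustration and are not kernel functions.

```c
#include <stdint.h>

#define ENTRIES_PER_TABLE 512ULL

/* Round-up integer division. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    return (n + d - 1) / d;
}

/*
 * Count the 4 KiB page-table pages needed to identity map [0, bytes)
 * under x86-64 4-level paging, using one leaf page size throughout.
 * leaf_shift: 30 for 1 GiB pages, 21 for 2 MiB pages, 12 for 4 KiB pages.
 * Hypothetical helper for illustration only.
 */
static uint64_t table_pages(uint64_t bytes, unsigned leaf_shift)
{
    uint64_t pages = 0;
    /* Number of leaf entries that must be stored at the lowest level used. */
    uint64_t entries = div_round_up(bytes, 1ULL << leaf_shift);
    unsigned level_shift = leaf_shift;

    /*
     * Walk upward one level at a time. Each level needs enough tables to
     * hold the entries of the level below it. The top-level table indexes
     * address bits 39..47, so stop once that level has been counted.
     */
    while (level_shift <= 39) {
        entries = div_round_up(entries, ENTRIES_PER_TABLE);
        pages += entries;
        level_shift += 9;
    }
    return pages;
}
```

Under these assumptions, identity mapping 64 GiB takes 2 table pages with 1 GiB leaves, 66 with 2 MiB leaves, and 32834 with 4 KiB leaves. The TLB cost grows in the same direction, since each smaller page needs its own TLB entry.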
I think I'm beginning to get your perspective. From your point of view, is kexec failing with "nogbpages" set a bug? My point of view is it likely is. I think your view would say it isn't?
I would say it is a bug.
Part of the bug is someone yet again taking something simple that kexec is doing and reworking it to use generic code, then changing the generic code to do something different from what kexec needs and then being surprised that kexec stops working.
The interface kexec wants to provide to whatever is being loaded is not having to think about page tables until that software is up far enough to enable its own page tables.
People being clever and enabling just enough pages in the page tables to work, based upon the results of some buggy (they are always buggy; some are just less so than others) boot up firmware, is where I get concerned.
Said another way the point is to build an identity mapped page table. Skipping some parts of the physical<->virtual identity because we seem to think no one will use it is likely a bug.
I really don't see any point in putting holes in such a page table for any address below the highest address that is good for something. Given that on some systems the MTRRs are insufficient to do their job, it definitely makes sense to not enable caching on areas that we don't think are memory.
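As a rough illustration of what I mean, here is a sketch that plans an identity map with no holes: every 1 GiB chunk below the highest useful address gets mapped, and only the cache attribute varies. Everything here (struct ram_range, enum map_attr, plan_identity_map()) is hypothetical and is not the kexec or generic kernel code. It is also deliberately coarse: a chunk that is only partly RAM, or that spans two adjacent RAM ranges, is marked uncached, where real code would split it into smaller pages.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GIB (1ULL << 30)

/* Hypothetical cache attributes for a mapped chunk. */
enum map_attr { MAP_WRITE_BACK, MAP_UNCACHED };

/* Hypothetical description of one RAM range, [start, end). */
struct ram_range {
    uint64_t start;
    uint64_t end;
};

/* True when the chunk [start, end) lies entirely inside one RAM range. */
static bool chunk_is_ram(uint64_t start, uint64_t end,
                         const struct ram_range *ram, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (start >= ram[i].start && end <= ram[i].end)
            return true;
    return false;
}

/*
 * Choose an attribute for every 1 GiB chunk below max_addr. No chunk is
 * skipped, so the resulting map has no holes; chunks that are not known
 * to be RAM are mapped uncached instead of being left out.
 * Returns the number of chunks planned, or 0 if attrs[] is too small.
 */
static size_t plan_identity_map(uint64_t max_addr,
                                const struct ram_range *ram, size_t n,
                                enum map_attr *attrs, size_t cap)
{
    size_t chunks = (size_t)((max_addr + GIB - 1) / GIB);

    if (chunks > cap)
        return 0;

    for (size_t i = 0; i < chunks; i++) {
        uint64_t start = (uint64_t)i * GIB;

        attrs[i] = chunk_is_ram(start, start + GIB, ram, n)
                       ? MAP_WRITE_BACK
                       : MAP_UNCACHED;
    }
    return chunks;
}
```

The point of the sketch is that what changes per region is the cache attribute, not whether the region is present in the identity map.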
Eric