RE: [PATCH v5 01/15] genpt: Generic Page Table base API

19 Sep 2025

      ...
From: Jason Gunthorpe jgg@nvidia.com
Sent: Friday, September 19, 2025 2:07 AM
On Thu, Sep 18, 2025 at 06:49:08AM +0000, Tian, Kevin wrote:
...
...
This is enough to implement the 8 initial format variations with all of
their features:

Entries comprised of contiguous blocks of IO PTEs for larger page
sizes (AMDv1, ARMv8)
Multi-level tables, up to 6 levels. Runtime selected top level
Runtime variable table level size (ARM's concatenated tables)
Expandable top level (AMDv1)

any more context about this one? how is it different from the earlier
"runtime selected top level"?
How about:

The size of the top table level can be selected at runtime (ARM's
concatenated tables)
The number of levels in the table can optionally increase dynamically
during map (AMDv1)

clearer.
...
...
...

item/entry_size

The number of bytes of VA the table translates for.

If the item is a table entry then the next table covers

this size. If the entry is an output address then the

s/is/translates/
Don't follow?
entry is not an address. So I meant:
"If the entry translates to an output address"
...
...
...
+/*

PT_WARN_ON is used for invariants that the kunit should be checking

can't

happen.

*/

+#if IS_ENABLED(CONFIG_DEBUG_GENERIC_PT)
+#define PT_WARN_ON WARN_ON
+#else
+static inline bool PT_WARN_ON(bool condition)
+{

return false;

+}
+#endif
Then call it PT_DBG_WARN_ON() to be more explicit?
Ah, I'd rather leave it..
...
btw looks there is no plain WARN_ON() used in generic-pt. Just be curious
about the rationale behind. Is it a new trend to contain all warnings under
a debug option?
It is sort of like VM_WARN_ON(), the places that got put are largely
performance path so you don't want it enabled unless doing debugging.
Generally the idea is to use PT_WARN_ON() on performance path cases
only, and leave normal stuff to normal WARN_ON.
Good to know
...
...
...

+/* If not supplied by the format then contiguous pages are not
supported */
...
...
+#ifndef pt_entry_num_contig_lg2
+static inline unsigned int pt_entry_num_contig_lg2(const struct pt_state
*pts)
+{

return ilog2(1);

+}

+static inline unsigned short pt_contig_count_lg2(const struct pt_state
*pts)
...
...
+{

return ilog2(1);

+}
what is the difference between above two helpers?
Oh, pt_contig_count_lg2 didn't get kdocs because they are internal
helpers to build other functions..
Like this:
/*

If not supplied by the format then contiguous pages are not supported.

If contiguous pages are supported then the format must also provide
pt_contig_count_lg2() if it supports a single contiguous size per level,
or pt_possible_sizes() if it supports multiple sizes per level.

could be simplified to require the format to always support pt_possible_sizes()
if contiguous sizes are supported, no matter being a single size or multiple.
...
*/
#ifndef pt_entry_num_contig_lg2
static inline unsigned int pt_entry_num_contig_lg2(const struct pt_state *pts)
{
   return ilog2(1);
}
/*

Return the number of contiguous OA items forming an entry at this table

level
 */
static inline unsigned short pt_contig_count_lg2(const struct pt_state *pts)
{
   return ilog2(1);
}
#endif
...
It's currently not implemented by any driver so will have the default version
returning 0. and it is only used by default pt_possible_sizes(), which then
returns only one page size accordingly.
ARM and RISCV use it. AMD is the only format that support more than
one size of contiguous per level.
hmm I didn't find ARM/RISCV defining pt_contig_count_lg2(). So in all cases
it's the default version returning ilog2(1). Then what's the point of keeping
it instead of directly using ilog2(1) in default pt_possible_sizes? sorry that 
I still didn't connect the dots here.
...
...
...

+/**

pt_item_fully_covered() - Check if the item or entry is entirely

contained
...
...

                      within pts->range

when using pts it's more accurate to call it pt_entry_fully_covered()
Not so much related to PTS, as pts could be either, but it makes more
sense to refer to table poitner as an entry than does it to refer to a
contiguous entry an item.
make sense.
...
...
...

PT_FEAT_FLUSH_RANGE,
/**
* @PT_FEAT_FLUSH_RANGE_NO_GAPS: Like PT_FEAT_FLUSH_RANGE

except that

* the optimization objective is to only flush IOVA that has been

* changed. This mode is suitable for cases like hypervisor shadowing

* where flushing unchanged ranges may cause the hypervisor to

reparse

* significant amount of page table.

*/

PT_FEAT_FLUSH_RANGE_NO_GAPS,

FLUSH_RANGE and FLUSH_RANGE_NO_GAPS are mutually exclusive but
one format must select one? then we could just keep one flag (NO_GAP)
then feature off means FLUSH_RANGE.
I think at least one or two more flushing modes will be needed, and
they would be mutually exclusive. It is setup with this one hot
encoding because of how the pt_feature function compile optimization
works.
clear now.

2025

2024

2023

2022

2021

2020

2019

2018

2017

RE: [PATCH v5 01/15] genpt: Generic Page Table base API