Hi Linus,
The following changes since commit 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0:
Linux 6.10-rc1 (2024-05-26 15:20:12 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
for you to fetch changes up to 3ac36aa7307363b7247ccb6f6a804e11496b2b36:
x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node() (2024-06-06 22:20:39 +0300)
----------------------------------------------------------------
Jan Beulich (2):
      memblock: make memblock_set_node() also warn about use of MAX_NUMNODES
      x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node()

 arch/x86/mm/numa.c | 6 +++---
 mm/memblock.c      | 4 ++++
 2 files changed, 7 insertions(+), 3 deletions(-)
On Thu, 13 Jun 2024 at 07:11, Mike Rapoport rppt@kernel.org wrote:
> https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
What's going on? This is the second pull request recently that doesn't actually mention where to pull from.
I can do a "git ls-remote", and I see that you have a tag called "fixes-2024-06-13" that then points to the commit you mention:
> for you to fetch changes up to 3ac36aa7307363b7247ccb6f6a804e11496b2b36:
but that tag name isn't actually in the pull request.
Is there some broken scripting that people have started using (or have been using for a while and was recently broken)?
Linus
On Thu, 13 Jun 2024 at 10:09, Linus Torvalds torvalds@linux-foundation.org wrote:
> Is there some broken scripting that people have started using (or have been using for a while and was recently broken)?
... and then when I actually pull the code, I note that the problem where it checked _one_ bogus value has just been replaced with checking _another_ bogus value.
Christ.
What if people use a node ID that is simply outside the range entirely, instead of one of those special node IDs?
And now for memblock_set_node() you should apparently use NUMA_NO_NODE to not get a warning, but for memblock_set_region_node() apparently the right random constant to use is MAX_NUMNODES.
Does *any* of this make sense? No.
How about instead of having two random constants - and not having any range checking that I see - just have *one* random constant for "I have no range", call that NUMA_NO_NODE, and then have a simple helper for "do I have a valid range", and make that be
    static inline bool numa_valid_node(int nid)
    {
        return (unsigned int)nid < MAX_NUMNODES;
    }
or something like that? Notice that now *all* of
- NUMA_NO_NODE (explicitly no node)
- MAX_NUMNODES (randomly used no node)
- out of range node (who knows wth firmware tables do?)
will get the same result from that "numa_valid_node()" function.
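The point about the unsigned cast can be checked in user space. What follows is a minimal sketch, not kernel code: the value 64 for MAX_NUMNODES is an assumption made for illustration (the kernel derives it from its configuration), and numa_valid_node_selfcheck() is a hypothetical name introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; the kernel derives MAX_NUMNODES from its config. */
#define MAX_NUMNODES 64
#define NUMA_NO_NODE (-1)

/* One unsigned comparison rejects negative IDs and IDs >= MAX_NUMNODES alike. */
static inline bool numa_valid_node(int nid)
{
	return (unsigned int)nid < MAX_NUMNODES;
}

/* Hypothetical self-check covering the three "no node" cases listed above. */
static void numa_valid_node_selfcheck(void)
{
	assert(numa_valid_node(0));
	assert(numa_valid_node(MAX_NUMNODES - 1));
	assert(!numa_valid_node(NUMA_NO_NODE));  /* explicitly no node */
	assert(!numa_valid_node(MAX_NUMNODES));  /* randomly used no node */
	assert(!numa_valid_node(1000));          /* out-of-range ID, e.g. from firmware */
}
```

Casting -1 to unsigned int yields UINT_MAX, which is why a single comparison covers the negative sentinel as well as anything at or above MAX_NUMNODES.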
And at that point you don't need to care, you don't need to warn, and you don't need to have these insane rules where "sometimes you *HAVE* to use NUMA_NO_NODE, or we warn, in other cases MAX_NUMNODES is the thing".
Please? IOW, instead of adding a warning for fragile code, then change some caller to follow the new rules, JUST FIX THE STUPID FRAGILITY!
Or hey, just do
#define NUMA_NO_NODE MAX_NUMNODES
and have two names for the *same* constant, instead of having two different constants with strange semantic differences that seem to make no sense and where the memblock code itself seems to go back-and-forth on it in different contexts.
Linus
On 13.06.2024 19:38, Linus Torvalds wrote:
> On Thu, 13 Jun 2024 at 10:09, Linus Torvalds torvalds@linux-foundation.org wrote:
>> Is there some broken scripting that people have started using (or have been using for a while and was recently broken)?
> ... and then when I actually pull the code, I note that the problem where it checked _one_ bogus value has just been replaced with checking _another_ bogus value.
> Christ.
> What if people use a node ID that is simply outside the range entirely, instead of one of those special node IDs?
> And now for memblock_set_node() you should apparently use NUMA_NO_NODE to not get a warning, but for memblock_set_region_node() apparently the right random constant to use is MAX_NUMNODES.
> Does *any* of this make sense? No.
> How about instead of having two random constants - and not having any range checking that I see - just have *one* random constant for "I have no range", call that NUMA_NO_NODE,
Just to mention it - my understanding is that this is an ongoing process heading in this very direction. I'm not an mm person at all, so I can't tell why the conversion wasn't done / can't be done all in one go.
Jan
> and then have a simple helper for "do I have a valid range", and make that be
>
>     static inline bool numa_valid_node(int nid)
>     {
>         return (unsigned int)nid < MAX_NUMNODES;
>     }
>
> or something like that? Notice that now *all* of
>  - NUMA_NO_NODE (explicitly no node)
>  - MAX_NUMNODES (randomly used no node)
>  - out of range node (who knows wth firmware tables do?)
> will get the same result from that "numa_valid_node()" function.
> And at that point you don't need to care, you don't need to warn, and you don't need to have these insane rules where "sometimes you *HAVE* to use NUMA_NO_NODE, or we warn, in other cases MAX_NUMNODES is the thing".
> Please? IOW, instead of adding a warning for fragile code, then change some caller to follow the new rules, JUST FIX THE STUPID FRAGILITY!
> Or hey, just do
>
>     #define NUMA_NO_NODE MAX_NUMNODES
>
> and have two names for the *same* constant, instead of having two different constants with strange semantic differences that seem to make no sense and where the memblock code itself seems to go back-and-forth on it in different contexts.
> Linus
On Fri, Jun 14, 2024 at 08:01:33AM +0200, Jan Beulich wrote:
> On 13.06.2024 19:38, Linus Torvalds wrote:
>> On Thu, 13 Jun 2024 at 10:09, Linus Torvalds torvalds@linux-foundation.org wrote:
>>> Is there some broken scripting that people have started using (or have been using for a while and was recently broken)?
>> ... and then when I actually pull the code, I note that the problem where it checked _one_ bogus value has just been replaced with checking _another_ bogus value.
>> Christ.
>> What if people use a node ID that is simply outside the range entirely, instead of one of those special node IDs?
>> And now for memblock_set_node() you should apparently use NUMA_NO_NODE to not get a warning, but for memblock_set_region_node() apparently the right random constant to use is MAX_NUMNODES.
>> Does *any* of this make sense? No.
>> How about instead of having two random constants - and not having any range checking that I see - just have *one* random constant for "I have no range", call that NUMA_NO_NODE,
> Just to mention it - my understanding is that this is an ongoing process heading in this very direction. I'm not an mm person at all, so I can't tell why the conversion wasn't done / can't be done all in one go.
Nah, it's an historical mess and my oversight.
> Jan
On Thu, Jun 13, 2024 at 10:38:28AM -0700, Linus Torvalds wrote:
> On Thu, 13 Jun 2024 at 10:09, Linus Torvalds torvalds@linux-foundation.org wrote:
>> Is there some broken scripting that people have started using (or have been using for a while and was recently broken)?
> ... and then when I actually pull the code, I note that the problem where it checked _one_ bogus value has just been replaced with checking _another_ bogus value.
> Christ.
> What if people use a node ID that is simply outside the range entirely, instead of one of those special node IDs?
> And now for memblock_set_node() you should apparently use NUMA_NO_NODE to not get a warning, but for memblock_set_region_node() apparently the right random constant to use is MAX_NUMNODES.
> Does *any* of this make sense? No.
> How about instead of having two random constants - and not having any range checking that I see - just have *one* random constant for "I have no range", call that NUMA_NO_NODE, and then have a simple helper for "do I have a valid range", and make that be
>
>     static inline bool numa_valid_node(int nid)
>     {
>         return (unsigned int)nid < MAX_NUMNODES;
>     }
>
> or something like that? Notice that now *all* of
>  - NUMA_NO_NODE (explicitly no node)
>  - MAX_NUMNODES (randomly used no node)
>  - out of range node (who knows wth firmware tables do?)
> will get the same result from that "numa_valid_node()" function.
> And at that point you don't need to care, you don't need to warn, and you don't need to have these insane rules where "sometimes you *HAVE* to use NUMA_NO_NODE, or we warn, in other cases MAX_NUMNODES is the thing".
> Please? IOW, instead of adding a warning for fragile code, then change some caller to follow the new rules, JUST FIX THE STUPID FRAGILITY!
> Or hey, just do
>
>     #define NUMA_NO_NODE MAX_NUMNODES
>
> and have two names for the *same* constant, instead of having two different constants with strange semantic differences that seem to make no sense and where the memblock code itself seems to go back-and-forth on it in different contexts.
A single constant is likely to backfire because I remember seeing checks like 'if (nid < 0)' so redefining NUMA_NO_NODE will require auditing all those.
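To illustrate the concern, here is a small user-space sketch under stated assumptions: MAX_NUMNODES is set to 64 purely for illustration, OLD_NUMA_NO_NODE and NEW_NUMA_NO_NODE are hypothetical names for the current and the redefined sentinel, and legacy_is_no_node() stands in for an existing 'if (nid < 0)' style check somewhere in the tree.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NUMNODES 64                /* assumed value, for illustration only */
#define OLD_NUMA_NO_NODE (-1)          /* the current definition */
#define NEW_NUMA_NO_NODE MAX_NUMNODES  /* hypothetical redefinition */

/* Hypothetical caller-side check of the kind 'if (nid < 0)'. */
static bool legacy_is_no_node(int nid)
{
	return nid < 0;
}

/* Range-based helper: valid means 0 <= nid < MAX_NUMNODES. */
static bool numa_valid_node(int nid)
{
	return nid >= 0 && nid < MAX_NUMNODES;
}
```

With these definitions legacy_is_no_node(OLD_NUMA_NO_NODE) is true, but legacy_is_no_node(NEW_NUMA_NO_NODE) is false, so such checks would silently stop recognising the sentinel after a redefinition. numa_valid_node() rejects both values, which is why it does not depend on which sentinel a caller happens to pass.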
But a helper function works great. I could only test it lightly, as I don't have a fleet of machines with a variety of memory layouts, so I'm planning to push it into -next early next week (with the subject replaced by a more informative one).
From 319eddd74b372cae840782c7d53832ab30533a6b Mon Sep 17 00:00:00 2001
From: "Mike Rapoport (IBM)" rppt@kernel.org
Date: Fri, 14 Jun 2024 11:05:43 +0300
Subject: [PATCH] memblock: FIX THE STUPID FRAGILITY

Introduce numa_valid_node(nid) that verifies that nid is a valid node ID and
use that instead of comparing nid parameter with either NUMA_NO_NODE or
MAX_NUMNODES.

This makes the checks for valid node IDs consistent and more robust and
allows to get rid of multiple WARNings.

Suggested-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org
---
 include/linux/numa.h |  5 +++++
 mm/memblock.c        | 28 +++++++---------------------
 2 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/include/linux/numa.h b/include/linux/numa.h
index 1d43371fafd2..eb19503604fe 100644
--- a/include/linux/numa.h
+++ b/include/linux/numa.h
@@ -15,6 +15,11 @@
 #define NUMA_NO_NODE (-1)
 #define NUMA_NO_MEMBLK (-1)

+static inline bool numa_valid_node(int nid)
+{
+	return nid >= 0 && nid < MAX_NUMNODES;
+}
+
 /* optionally keep NUMA memory info available post init */
 #ifdef CONFIG_NUMA_KEEP_MEMINFO
 #define __initdata_or_meminfo
diff --git a/mm/memblock.c b/mm/memblock.c
index 08e9806b1cf9..e81fb68f7f88 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -754,7 +754,7 @@ bool __init_memblock memblock_validate_numa_coverage(unsigned long threshold_byt

 	/* calculate lose page */
 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
-		if (nid == NUMA_NO_NODE)
+		if (!numa_valid_node(nid))
 			nr_pages += end_pfn - start_pfn;
 	}

@@ -1061,7 +1061,7 @@ static bool should_skip_region(struct memblock_type *type,
 		return false;

 	/* only memory regions are associated with nodes, check it */
-	if (nid != NUMA_NO_NODE && nid != m_nid)
+	if (numa_valid_node(nid) && nid != m_nid)
 		return true;

 	/* skip hotpluggable memory regions if needed */
@@ -1118,10 +1118,6 @@ void __next_mem_range(u64 *idx, int nid, enum memblock_flags flags,
 	int idx_a = *idx & 0xffffffff;
 	int idx_b = *idx >> 32;

-	if (WARN_ONCE(nid == MAX_NUMNODES,
-	"Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	for (; idx_a < type_a->cnt; idx_a++) {
 		struct memblock_region *m = &type_a->regions[idx_a];

@@ -1215,9 +1211,6 @@ void __init_memblock __next_mem_range_rev(u64 *idx, int nid,
 	int idx_a = *idx & 0xffffffff;
 	int idx_b = *idx >> 32;

-	if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	if (*idx == (u64)ULLONG_MAX) {
 		idx_a = type_a->cnt - 1;
 		if (type_b != NULL)
@@ -1303,7 +1296,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,

 		if (PFN_UP(r->base) >= PFN_DOWN(r->base + r->size))
 			continue;
-		if (nid == MAX_NUMNODES || nid == r_nid)
+		if (!numa_valid_node(nid) || nid == r_nid)
 			break;
 	}
 	if (*idx >= type->cnt) {
@@ -1339,10 +1332,6 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
 	int start_rgn, end_rgn;
 	int i, ret;

-	if (WARN_ONCE(nid == MAX_NUMNODES,
-		      "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
 	if (ret)
 		return ret;
@@ -1452,9 +1441,6 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 	enum memblock_flags flags = choose_memblock_flags();
 	phys_addr_t found;

-	if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	if (!align) {
 		/* Can't use WARNs this early in boot on powerpc */
 		dump_stack();
@@ -1467,7 +1453,7 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 	if (found && !memblock_reserve(found, size))
 		goto done;

-	if (nid != NUMA_NO_NODE && !exact_nid) {
+	if (numa_valid_node(nid) && !exact_nid) {
 		found = memblock_find_in_range_node(size, align, start,
 						    end, NUMA_NO_NODE,
 						    flags);
@@ -1987,7 +1973,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 		end = base + size - 1;
 		flags = rgn->flags;
 #ifdef CONFIG_NUMA
-		if (memblock_get_region_node(rgn) != MAX_NUMNODES)
+		if (numa_valid_node(memblock_get_region_node(rgn)))
 			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
 				 memblock_get_region_node(rgn));
 #endif
@@ -2181,7 +2167,7 @@ static void __init memmap_init_reserved_pages(void)
 		start = region->base;
 		end = start + region->size;

-		if (nid == NUMA_NO_NODE || nid >= MAX_NUMNODES)
+		if (!numa_valid_node(nid))
 			nid = early_pfn_to_nid(PFN_DOWN(start));

 		reserve_bootmem_region(start, end, nid);
@@ -2272,7 +2258,7 @@ static int memblock_debug_show(struct seq_file *m, void *private)

 		seq_printf(m, "%4d: ", i);
 		seq_printf(m, "%pa..%pa ", &reg->base, &end);
-		if (nid != MAX_NUMNODES)
+		if (numa_valid_node(nid))
 			seq_printf(m, "%4d ", nid);
 		else
 			seq_printf(m, "%4c ", 'x');
On Fri, 14 Jun 2024 at 01:20, Mike Rapoport rppt@kernel.org wrote:
> A single constant is likely to backfire because I remember seeing checks like 'if (nid < 0)' so redefining NUMA_NO_NODE will require auditing all those.
Yeah, fair enough.
> But a helper function works great.
Thanks, that patch looks like a nice improvement to me.
Linus
The pull request you sent on Thu, 13 Jun 2024 17:09:36 +0300:
https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock refs/heads/master
has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/3572597ca844f625a3c9ba629ed0872b64c16179
Thank you!