Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
Given the confusion, also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;
It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Cc: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Fixes: 1a4e564b7db9 ("resource: add resource_size()") Signed-off-by: Christian Marangi ansuelsmth@gmail.com --- include/linux/ioport.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 9afa30f9346f..1b8ce62255db 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -288,6 +288,9 @@ static inline void resource_set_range(struct resource *res,
static inline resource_size_t resource_size(const struct resource *res) { + if (!res->start && !res->end) + return 0; + return res->end - res->start + 1; } static inline unsigned long resource_type(const struct resource *res)
On Sun, Dec 07, 2025 at 10:53:48PM +0100, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
Given the confusion, also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Good catch!
Now, let's unveil which drivers rely on "broken" behaviour...
...
static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;
I think this breaks or might brake some of the drivers that rely on the proper calculation. If you supply the start and end for the same (if it's not 0), you will get 1 and it's _correct_ result (surprise surprise). One of the thing that may be directly affected (and regress) is the amount of IRQs calculation (which on some platforms may start from 0). However, in practice I think it's none nowadays in the upstream kernel.
return res->end - res->start + 1; }
That said, unfortunately, I think, you want to fix drivers one-by-one and this patch is incorrect as it brings inconsistency to the logic (1 occupied address whatever unit it has may still be valid resource).
Also a good start is to add test cases and add/update documentation.
On Mon, Dec 08, 2025 at 01:12:03AM +0200, Andy Shevchenko wrote:
On Sun, Dec 07, 2025 at 10:53:48PM +0100, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
Given the confusion, also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Good catch!
Now, let's unveil which drivers rely on "broken" behaviour...
...
static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;I think this breaks or might brake some of the drivers that rely on the proper calculation. If you supply the start and end for the same (if it's not 0), you will get 1 and it's _correct_ result (surprise surprise). One of the thing that may be directly affected (and regress) is the amount of IRQs calculation (which on some platforms may start from 0). However, in practice I think it's none nowadays in the upstream kernel.
One common usage of this is with address size. So if start and end is the same, then it's ok to have size 1?
return res->end - res->start + 1; }
That said, unfortunately, I think, you want to fix drivers one-by-one and this patch is incorrect as it brings inconsistency to the logic (1 occupied address whatever unit it has may still be valid resource).
Yep but probably never aligned? I don't think there is an arch in the world that is aligned to 1 byte?
Also a good start is to add test cases and add/update documentation.
I hoped this was simple enough to have the condition. The more articulate and safe change might be to: 1. rename this to __resource_size 2. rename every entry of resource_size to __resource_size 3. introduce a new resource_size commented and with the check 4. Use the new helper where it's actually needed?
From my search there are various place where the condition is like:
if (resource_size(&res)) ...
And this condition doesn't make any sense since it's always true (I highly suspect these case all fall in what I described)
For sure this needs to be discussed and we need to gather more info.
-- With Best Regards, Andy Shevchenko
On Mon, Dec 08, 2025 at 01:12:08AM +0200, Andy Shevchenko wrote:
On Sun, Dec 07, 2025 at 10:53:48PM +0100, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Actually this not true. Obviously if the end == start - 1, it will return 0. So, you really need _carefully_ check users one-by-one and see how resource is filled, before judging the test. It might or might not be broken. Each case is individual, but the observation you made is quite valuable, thanks!
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
Given the confusion, also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Good catch!
Now, let's unveil which drivers rely on "broken" behaviour...
...
static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;I think this breaks or might brake some of the drivers that rely on the proper calculation. If you supply the start and end for the same (if it's not 0), you will get 1 and it's _correct_ result (surprise surprise). One of the thing that may be directly affected (and regress) is the amount of IRQs calculation (which on some platforms may start from 0). However, in practice I think it's none nowadays in the upstream kernel.
return res->end - res->start + 1; }
That said, unfortunately, I think, you want to fix drivers one-by-one and this patch is incorrect as it brings inconsistency to the logic (1 occupied address whatever unit it has may still be valid resource).
Also a good start is to add test cases and add/update documentation.
On Mon, Dec 08, 2025 at 01:23:37AM +0200, Andy Shevchenko wrote:
On Mon, Dec 08, 2025 at 01:12:08AM +0200, Andy Shevchenko wrote:
On Sun, Dec 07, 2025 at 10:53:48PM +0100, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Actually this not true. Obviously if the end == start - 1, it will return 0. So, you really need _carefully_ check users one-by-one and see how resource is filled, before judging the test. It might or might not be broken. Each case is individual, but the observation you made is quite valuable, thanks!
Yes sure there are case where it can return zero but are there real world scenario like that in the context of resource_size for PCI or resouce for MMIO?
Again the idea of this patch was to start searching for error instead of simply fixing ath11k, I'm pretty sure there are other case that are currently working by luck.
Another idea might be to introduce a new helper and add all kind of checks to understand if the resource we are testing is all zero.
Something like resource_is_zero() that checks if start end and flags are all zero? (and fix all the case where the helper might be used in a wrong way?)
Or maybe we can change the condition of this to:
if (!res.flags && !res.start && !res.end) return 0;
Just putting some ideas on what would be the proper solution to the problem without having to analyze all the 990 case where the helper is used ehehehhe
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
Given the confusion, also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Good catch!
Now, let's unveil which drivers rely on "broken" behaviour...
...
static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;I think this breaks or might brake some of the drivers that rely on the proper calculation. If you supply the start and end for the same (if it's not 0), you will get 1 and it's _correct_ result (surprise surprise). One of the thing that may be directly affected (and regress) is the amount of IRQs calculation (which on some platforms may start from 0). However, in practice I think it's none nowadays in the upstream kernel.
return res->end - res->start + 1; }
That said, unfortunately, I think, you want to fix drivers one-by-one and this patch is incorrect as it brings inconsistency to the logic (1 occupied address whatever unit it has may still be valid resource).
Also a good start is to add test cases and add/update documentation.
-- With Best Regards, Andy Shevchenko
On Mon, Dec 08, 2025 at 12:20:16AM +0100, Christian Marangi wrote:
On Mon, Dec 08, 2025 at 01:12:03AM +0200, Andy Shevchenko wrote:
On Sun, Dec 07, 2025 at 10:53:48PM +0100, Christian Marangi wrote:
...
static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;I think this breaks or might brake some of the drivers that rely on the proper calculation. If you supply the start and end for the same (if it's not 0), you will get 1 and it's _correct_ result (surprise surprise). One of the thing that may be directly affected (and regress) is the amount of IRQs calculation (which on some platforms may start from 0). However, in practice I think it's none nowadays in the upstream kernel.
One common usage of this is with address size. So if start and end is the same, then it's ok to have size 1?
First of all, resources is not _only_ about the addresses. It may be just... a resource. Whatever it means.
So, second question here is that do we use open or closed interval? I always have an impression that resource model (or in general model with start/end, instead of start/size) uses closed intervals.
Third note, the resources is not only start and end, they also have flags, and this has to be considered when retrieving the content of the resource.
So, in my world, the 1 _is_ the correct result for resource [start .. end], where start = x, and end = x.
return res->end - res->start + 1; }
That said, unfortunately, I think, you want to fix drivers one-by-one and this patch is incorrect as it brings inconsistency to the logic (1 occupied address whatever unit it has may still be valid resource).
Yep but probably never aligned? I don't think there is an arch in the world that is aligned to 1 byte?
I believe you stick to much to the resource == address, no, this is wrong. We have other resources that are not addresses, or we have resource where 0 is _valid_ start point (IO ports, IIRC, is the case, IRQs — of course!).
Also a good start is to add test cases and add/update documentation.
I hoped this was simple enough to have the condition.
I think it needs to be checked first with the actual users.
The more articulate and safe change might be to:
- rename this to __resource_size
- rename every entry of resource_size to __resource_size
- introduce a new resource_size commented and with the check
- Use the new helper where it's actually needed?
Maybe it's simpler and just (almost) nothing should be done? Check what is there, in the resources, before the checks and what flags are being checked before those size checks.
From my search there are various place where the condition is like:
if (resource_size(&res)) ...
And this condition doesn't make any sense since it's always true (I highly suspect these case all fall in what I described)
Not always, I already pointed out that it's not true.
For sure this needs to be discussed and we need to gather more info.
On Mon, Dec 08, 2025 at 12:33:03AM +0100, Christian Marangi wrote:
On Mon, Dec 08, 2025 at 01:23:37AM +0200, Andy Shevchenko wrote:
On Mon, Dec 08, 2025 at 01:12:08AM +0200, Andy Shevchenko wrote:
On Sun, Dec 07, 2025 at 10:53:48PM +0100, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Actually this not true. Obviously if the end == start - 1, it will return 0. So, you really need _carefully_ check users one-by-one and see how resource is filled, before judging the test. It might or might not be broken. Each case is individual, but the observation you made is quite valuable, thanks!
Yes sure there are case where it can return zero but are there real world scenario like that in the context of resource_size for PCI or resouce for MMIO?
I believe PCI by design won't enumerate the device that has 0 size. Either it will be cut at the level of some checks before even considering filing the resources, or if resource_size() returns 1, it won't find proper window or window will be taken at bare minimum of a size of the page (usually 4k). I'm not sure that PCI code is affected by this, but it worth to check.
Again the idea of this patch was to start searching for error instead of simply fixing ath11k, I'm pretty sure there are other case that are currently working by luck.
True.
Another idea might be to introduce a new helper and add all kind of checks to understand if the resource we are testing is all zero.
I think the better choice is to check all places where resource is assigned and convert that to a helper when the size can be or is 0. Definitely it's a lot of code to be audited.
But having something like
resource_set_size(res, 0) // note, this API is already in upstream or resource_set_range(res, start, 0)
instead of direct assignment of start and end is much simpler approach.
Something like resource_is_zero() that checks if start end and flags are all zero? (and fix all the case where the helper might be used in a wrong way?)
Or maybe we can change the condition of this to:
if (!res.flags && !res.start && !res.end) return 0;
Just putting some ideas on what would be the proper solution to the problem without having to analyze all the 990 case where the helper is used ehehehhe
Like I said, the best and really helpful start is - test cases for these corner cases - documentation audit and update
When we have documentation in place and test cases to show how it all works, we may point out to the bugs or incorrect assumptions made in the code.
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
Given the confusion, also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Good catch!
Now, let's unveil which drivers rely on "broken" behaviour...
...
static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;I think this breaks or might brake some of the drivers that rely on the proper calculation. If you supply the start and end for the same (if it's not 0), you will get 1 and it's _correct_ result (surprise surprise). One of the thing that may be directly affected (and regress) is the amount of IRQs calculation (which on some platforms may start from 0). However, in practice I think it's none nowadays in the upstream kernel.
return res->end - res->start + 1; }
That said, unfortunately, I think, you want to fix drivers one-by-one and this patch is incorrect as it brings inconsistency to the logic (1 occupied address whatever unit it has may still be valid resource).
Also a good start is to add test cases and add/update documentation.
On Mon, Dec 08, 2025 at 01:57:37AM +0200, Andy Shevchenko wrote:
On Mon, Dec 08, 2025 at 12:33:03AM +0100, Christian Marangi wrote:
...
I think the better choice is to check all places where resource is assigned and convert that to a helper when the size can be or is 0. Definitely it's a lot of code to be audited.
But having something like
resource_set_size(res, 0) // note, this API is already in upstream or resource_set_range(res, start, 0)
instead of direct assignment of start and end is much simpler approach.
Just run
git grep -n -w resource_set_range
and you see that even OF uses it in once case already. And PCI core is full of the calls, that's why I believe PCI is not affected as it does the correct thing to begin with.
On Sun, 7 Dec 2025, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
So what's the actual resource that causes this ath11 regression? You seem possess that knowledge as you knew to create this patch.
of_reserved_mem_region_to_resource() does use resource_set_range() so how did the end address become 0? Is there perhaps some other bug being covered up with this patch to resource_size()?
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
???
resource_size() does return 0 when ->start = 0 and ->end = - 1 which is the correctly setup of a zero-sized resource (when flags are non-zero).
Given the confusion,
There's lots of resource setup code which does set resource end address properly by applying -1 to it.
also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;
This place does not tell you what ->end is expected to be set? Where did you infer that information?
I suspect this code would want to do resource_is_assigned() (which doesn't exists yet but would check only res->parent != NULL) or base the check on IORESOURCE_UNSET|IORESOURCE_DISABLED.
Often using resourse_size() (or a few other address based checks) in drivers is misguided, what drivers are more interested in is if the resource is valid and/or properly in the resource tree (assigned by the PCI core which implies it's valid too and has a non-zero size), not so much about it's size. As you can see, size itself not even used here at all, that is, this place never was interested in size but uses it as a proxy for something else (like also many other drivers do)!
It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
Who creates such resources?
If flags are non-zero, those places should be fixed (I'm currently fixing one case from drivers/pci/probe.c thanks to you bringing this up :-)) but I think the number of places to fix are fewer than you seem to think. I read though all end assignments in PCI core and found only this single(!) problem there.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Cc: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Fixes: 1a4e564b7db9 ("resource: add resource_size()") Signed-off-by: Christian Marangi ansuelsmth@gmail.com
include/linux/ioport.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 9afa30f9346f..1b8ce62255db 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -288,6 +288,9 @@ static inline void resource_set_range(struct resource *res, static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;
This looks wrong to me.
Lets try to fix the resource setup side instead. The real problem is in any code that sets ->end to zero when it wanted to set resource size to zero, which are not the same thing. To make it simpler (no need to manually apply -1 to the end address) and less error prone, we have helpers resource_set_range/size() and DEFINE_RES() that handle applying -1 correctly.
- return res->end - res->start + 1;
} static inline unsigned long resource_type(const struct resource *res)
Taking this suggestion from your other email:
if (!res.flags && !res.start && !res.end) return 0;
IMO, the caller shouldn't be calling resource_size() at all when !res.flags as that indicates the caller is confused and doesn't really know what it is even checking for. If a driver does even know if the resource is valid (has a valid type) or not, it should be checking that _first_ (it might needs the size for valid resources but it should IMO still check validity of the resource first).
Also, as mentioned above, some/many drivers are not fundamentally interested in validity itself but whether the resource is assigned to the resource tree.
For valid resources, this extra check is entirely unnecessary.
I'd be supportive of adding
WARN_ON_ONCE(!res.flags)
...here but that will be likely a bit too noisy without first auditting many places (from stable people's perspective a triggering WARN_ON() => DoS). But that's IMO the direction kernel code should be heading.
On Mon, Dec 08, 2025 at 03:29:01PM +0200, Ilpo Järvinen wrote:
On Sun, 7 Dec 2025, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
So what's the actual resource that causes this ath11 regression? You seem possess that knowledge as you knew to create this patch.
I'm not gatekeeping the information. Just trying to get involved as much people as possible in checking this.
This helper is used in generic OF and PCI code and since there is the combo 2008 code + no comments, then it's sensible to make sure no confusion was applied across the year.
of_reserved_mem_region_to_resource() does use resource_set_range() so how did the end address become 0? Is there perhaps some other bug being covered up with this patch to resource_size()?
This IS EXACTLY the problem HERE. There is the assumption that resource_size() MUST be used ALWAYS after a resource struct is used.
But this is not alywas the case... nothing stops to call resource_size() on a a local
struct resource res = {}
and assume resource_size() returning 0 (as struct resource is all zero and assuming logically the size to be zero) (this assumption has been already proved to be wrong by the previous message but it's very common to assume that an helper called on an all zero struct returns 0 if the subject is about size or range of address)
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
???
resource_size() does return 0 when ->start = 0 and ->end = - 1 which is the correctly setup of a zero-sized resource (when flags are non-zero).
Again IF the resource struct is correctly init, in the case of kmalloc-ed stuff or local variables init to zero, then you quickly trigger the bug as both start and end will be zero.
Also I honestly found some difficulties finding where end is set to -1.
I actually found case in OF code where both start and end are set to -1 resulting.
Can you point me where is this init code?
Given the confusion,
There's lots of resource setup code which does set resource end address properly by applying -1 to it.
also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;This place does not tell you what ->end is expected to be set? Where did you infer that information?
I suspect this code would want to do resource_is_assigned() (which doesn't exists yet but would check only res->parent != NULL) or base the check on IORESOURCE_UNSET|IORESOURCE_DISABLED.
Often using resourse_size() (or a few other address based checks) in drivers is misguided, what drivers are more interested in is if the resource is valid and/or properly in the resource tree (assigned by the PCI core which implies it's valid too and has a non-zero size), not so much about it's size. As you can see, size itself not even used here at all, that is, this place never was interested in size but uses it as a proxy for something else (like also many other drivers do)!
Yes I agree that resource_size() is not the correct way to check if a resource is valid but it seems to be used for this task in some code.
It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
Who creates such resources?
It's more a matter of using resource_size() in place where struct resource is optionally used.
If flags are non-zero, those places should be fixed (I'm currently fixing one case from drivers/pci/probe.c thanks to you bringing this up :-)) but I think the number of places to fix are fewer than you seem to think. I read though all end assignments in PCI core and found only this single(!) problem there.
Thanks a lot for checking the PCI code, that is one of my concern subsystem where this might be problematic. The other is the OF code.
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Cc: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Fixes: 1a4e564b7db9 ("resource: add resource_size()") Signed-off-by: Christian Marangi ansuelsmth@gmail.com
include/linux/ioport.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 9afa30f9346f..1b8ce62255db 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -288,6 +288,9 @@ static inline void resource_set_range(struct resource *res, static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;This looks wrong to me.
Yes it's indded wrong as I didn't think of case where start and end can match and correctly return a size of 1.
Lets try to fix the resource setup side instead. The real problem is in any code that sets ->end to zero when it wanted to set resource size to zero, which are not the same thing. To make it simpler (no need to manually apply -1 to the end address) and less error prone, we have helpers resource_set_range/size() and DEFINE_RES() that handle applying -1 correctly.
OK! Thanks for the pointer so there is a DEFINE_RES() for local variables!
- return res->end - res->start + 1;
} static inline unsigned long resource_type(const struct resource *res)
Taking this suggestion from your other email:
if (!res.flags && !res.start && !res.end) return 0;
IMO, the caller shouldn't be calling resource_size() at all when !res.flags as that indicates the caller is confused and doesn't really know what it is even checking for. If a driver does even know if the resource is valid (has a valid type) or not, it should be checking that _first_ (it might needs the size for valid resources but it should IMO still check validity of the resource first).
Also, as mentioned above, some/many drivers are not fundamentally interested in validity itself but whether the resource is assigned to the resource tree.
For valid resources, this extra check is entirely unnecessary.
I'd be supportive of adding
WARN_ON_ONCE(!res.flags)
...here but that will be likely a bit too noisy without first auditting many places (from stable people's perspective a triggering WARN_ON() => DoS). But that's IMO the direction kernel code should be heading.
I'm more than happy to send a v2 of this fixing ath11k and adding the WARN_ON_ONCE with res.flags.
Seems the only correct way to track this down and improve this in the next kernel versions.
Also thanks to Andy and you for explaining this and reasurme me there wasn't a big bug around this helper but just some minor logical confusion.
Do you agree on the plan for v2? (we can assume the noise won't be so much as I expect only SOME driver to be affected by this problem) (I hope)
-- i.
On Mon, 8 Dec 2025, Christian Marangi wrote:
On Mon, Dec 08, 2025 at 03:29:01PM +0200, Ilpo Järvinen wrote:
On Sun, 7 Dec 2025, Christian Marangi wrote:
Commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region"") uncovered a massive problem with the usage of resource_size() helper.
The reported commit caused a regression with ath11k WiFi firmware loading and the change was just a simple replacement of duplicate code with a new helper of_reserved_mem_region_to_resource().
On reworking this, in the commit also a check for the presence of the node was replaced with resource_size(&res). This was done following the logic that if the node wasn't present then it's expected that also the resource_size is zero, mimicking the same if-else logic.
So what's the actual resource that causes this ath11 regression? You seem possess that knowledge as you knew to create this patch.
I'm not gatekeeping the information. Just trying to get involved as much people as possible in checking this.
This helper is used in generic OF and PCI code and since there is the combo 2008 code + no comments, then it's sensible to make sure no confusion was applied across the year.
of_reserved_mem_region_to_resource() does use resource_set_range() so how did the end address become 0? Is there perhaps some other bug being covered up with this patch to resource_size()?
This IS EXACTLY the problem HERE. There is the assumption that resource_size() MUST be used ALWAYS after a resource struct is used.
But this is not alywas the case... nothing stops to call resource_size() on a a local
struct resource res = {}
and assume resource_size() returning 0 (as struct resource is all zero and assuming logically the size to be zero) (this assumption has been already proved to be wrong by the previous message but it's very common to assume that an helper called on an all zero struct returns 0 if the subject is about size or range of address)
I guess you might have written this before reading the entire email as the WARN_ON_ONCE() I suggested would have caught the caller side bug(s).
The ath11k assumes res was setup by of_reserved_mem_region_to_resource() that appearently wasn't the case (a logic bug introduced in the commit 900730dc4705 ("wifi: ath: Use of_reserved_mem_region_to_resource() for "memory-region""), and then calls resource_size() without first making sure the resource is valid.
There are at least two, if not three, problems on the caller side here.
This was also the reason the regression was mostly hard to catch at first sight as the rework is correctly done given the assumption on the used helpers.
BUT this is actually not the case. On further inspection on resource_size() it was found that it NEVER actually returns 0.
Even if the resource value of start and end are 0, the return value of resource_size() will ALWAYS be 1, resulting in the broken if-else condition ALWAYS going in the first if condition.
This was simply confirmed by reading the resource_size() logic:
return res->end - res->start + 1;
???
resource_size() does return 0 when ->start = 0 and ->end = - 1 which is the correctly setup of a zero-sized resource (when flags are non-zero).
Again IF the resource struct is correctly init, in the case of kmalloc-ed stuff or local variables init to zero, then you quickly trigger the bug as both start and end will be zero.
Also I honestly found some difficulties finding where end is set to -1.
git grep -e '->end = .* - 1'
A more complex pattern or Coccinelle can find more for you if you want to find multiline or other more complex constructs.
Those mostly predate the init helpers (I did have a series to convert them to resource_set_range() but DEFINE_RES() got introduced in the meantime too so I'd have needed to adapt many of the changes so I left the series to rotten as it was always a cleanup side-project to me and I've more on my plate than I've time for currently even in the main resource assignment algorithm).
I actually found case in OF code where both start and end are set to -1 resulting.
Can you point me where is this init code?
As noted below, DEFINE_RES*(), resource_set_range() and resource_set_size() in include/linux/ioport.h are your friends here. In general, DEFINE_RES is preferred but when only addresses of a resource are adjusted dynamically, those C functions are better suited for the task.
Given the confusion,
There's lots of resource setup code which does set resource end address properly by applying -1 to it.
also other case of such usage were searched in the kernel and with great suprise it seems LOTS of place assume resource_size() should return zero in the context of the resource start and end set to 0.
Quoting for example comments in drivers/vfio/pci/vfio_pci_core.c:
/* * The PCI core shouldn't set up a resource with a * type but zero size. But there may be bugs that * cause us to do that. */ if (!resource_size(res)) goto no_mmap;This place does not tell you what ->end is expected to be set? Where did you infer that information?
I suspect this code would want to do resource_is_assigned() (which doesn't exists yet but would check only res->parent != NULL) or base the check on IORESOURCE_UNSET|IORESOURCE_DISABLED.
Often using resourse_size() (or a few other address based checks) in drivers is misguided, what drivers are more interested in is if the resource is valid and/or properly in the resource tree (assigned by the PCI core which implies it's valid too and has a non-zero size), not so much about it's size. As you can see, size itself not even used here at all, that is, this place never was interested in size but uses it as a proxy for something else (like also many other drivers do)!
Yes I agree that resource_size() is not the correct way to check if a resource is valid but it seems to be used for this task in some code.
It really seems resource_size() was tought with the assumption that resource struct was always correctly initialized before calling it and never set to zero.
But across the year this got lost and now there are lots of driver that assume resource_size() returns 0 if start and end are also 0.
Who creates such resources?
It's more a matter of using resource_size() in place where struct resource is optionally used.
Yes but it is a latent logic bug (using the resource size as the proxy for proper validity check in the logic).
Using these proxies (in general) is what makes it hard to make some resource change I'd actually want to do in the PCI core so I don't have have much love for it (I'll probably eventually have to audit them all myself :-(). As a result, we currently have the unfortunate situation that a resizable BAR that cannot be initially assigned is lost forever. The resource had to be reset on assignment failure to keep these proxies agreeing with the state of the resource, as endpoint drivers would be confused otherwise, so making space by resizing other BARs will not bring the other BAR back.
This proxy problem actually extends beyond mere resource_size(). There are also start and end address checks and pci_resource_len() checks similarly used as proxies. None of those proxies remain valid if PCI core stops to reset resource addresses and flags.
If flags are non-zero, those places should be fixed (I'm currently fixing one case from drivers/pci/probe.c thanks to you bringing this up :-)) but I think the number of places to fix are fewer than you seem to think. I read though all end assignments in PCI core and found only this single(!) problem there.
Thanks a lot for checking the PCI code, that is one of my concern subsystem where this might be problematic. The other is the OF code.
Unfortunately I'm no OF expert but at least drivers/of/ didn't have anything interesting when grepping with:
git grep -e '->end = ' -e '.end = ' -- drivers/of
To better handle this and make resource_size() returns correct value in such case, add a simple check and return 0 if both resource start and resource end are zero.
Cc: Rob Herring (Arm) robh@kernel.org Cc: stable@vger.kernel.org Fixes: 1a4e564b7db9 ("resource: add resource_size()") Signed-off-by: Christian Marangi ansuelsmth@gmail.com
include/linux/ioport.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 9afa30f9346f..1b8ce62255db 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -288,6 +288,9 @@ static inline void resource_set_range(struct resource *res, static inline resource_size_t resource_size(const struct resource *res) {
- if (!res->start && !res->end)
return 0;This looks wrong to me.
Yes it's indded wrong as I didn't think of case where start and end can match and correctly return a size of 1.
Lets try to fix the resource setup side instead. The real problem is in any code that sets ->end to zero when it wanted to set resource size to zero, which are not the same thing. To make it simpler (no need to manually apply -1 to the end address) and less error prone, we have helpers resource_set_range/size() and DEFINE_RES() that handle applying -1 correctly.
OK! Thanks for the pointer so there is a DEFINE_RES() for local variables!
- return res->end - res->start + 1;
} static inline unsigned long resource_type(const struct resource *res)
Taking this suggestion from your other email:
if (!res.flags && !res.start && !res.end) return 0;
IMO, the caller shouldn't be calling resource_size() at all when !res.flags as that indicates the caller is confused and doesn't really know what it is even checking for. If a driver does even know if the resource is valid (has a valid type) or not, it should be checking that _first_ (it might needs the size for valid resources but it should IMO still check validity of the resource first).
Also, as mentioned above, some/many drivers are not fundamentally interested in validity itself but whether the resource is assigned to the resource tree.
For valid resources, this extra check is entirely unnecessary.
I'd be supportive of adding
WARN_ON_ONCE(!res.flags)
...here but that will be likely a bit too noisy without first auditting many places (from stable people's perspective a triggering WARN_ON() => DoS). But that's IMO the direction kernel code should be heading.
I'm more than happy to send a v2 of this fixing ath11k and adding the WARN_ON_ONCE with res.flags.
Seems the only correct way to track this down and improve this in the next kernel versions.
Also thanks to Andy and you for explaining this and reasurme me there wasn't a big bug around this helper but just some minor logical confusion.
Do you agree on the plan for v2? (we can assume the noise won't be so much as I expect only SOME driver to be affected by this problem) (I hope)
Yes please to fixing ath11k!
And if you're fine with dealing all the reports that will get directed to you because of adding that WARN_ON_ONCE() into resource_size() I'd be very happy of having to not do it myself eventually but please do realize it may be more than a few reports over a few next kernel releases. ;-)
That being said, I just tested on one machine here and it seems to not fire even once so we could get more lucky than I was assuming.
It might also be helpful to add kernel doc to resource_size() and mention it should not be used as a proxy for other things as we've discussed here.
On contrary to what I said earlier, I just realized the series adding resource_assigned() got accepted [1] in this cycle so it's now available in Linus' tree.
[1] https://lore.kernel.org/all/69339e215b09f_1e0210057@dwillia2-mobl4.notmuch/
linux-stable-mirror@lists.linaro.org