On 12/11/25 5:15 am, Yang Shi wrote:
On 11/10/25 9:12 PM, Dev Jain wrote:
On 11/11/25 10:38 am, Yang Shi wrote:
On 11/10/25 8:55 PM, Dev Jain wrote:
On 11/11/25 10:14 am, Yang Shi wrote:
On 11/10/25 8:37 PM, Dev Jain wrote:
On 11/11/25 9:47 am, Yang Shi wrote:
>
> On 11/10/25 7:39 PM, Dev Jain wrote:
>>
>> On 05/11/25 9:27 am, Dev Jain wrote:
>>>
>>> On 04/11/25 6:26 pm, Will Deacon wrote:
>>>> On Tue, Nov 04, 2025 at 09:06:12AM +0530, Dev Jain wrote:
>>>>> On 04/11/25 12:15 am, Yang Shi wrote:
>>>>>> On 11/3/25 7:16 AM, Will Deacon wrote:
>>>>>>> On Mon, Nov 03, 2025 at 11:43:06AM +0530, Dev Jain wrote:
>>>>>>>> Post a166563e7ec3 ("arm64: mm: support large block mapping when
>>>>>>>> rodata=full"), __change_memory_common has a real chance of
>>>>>>>> failing due to split failure. Before that commit, this line was
>>>>>>>> introduced in c55191e96caa, still having a chance of failing if
>>>>>>>> it needs to allocate pagetable memory in apply_to_page_range,
>>>>>>>> although that has never been observed to be true. In general,
>>>>>>>> we should always propagate the return value to the caller.
>>>>>>>>
>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>> Fixes: c55191e96caa ("arm64: mm: apply r/o permissions of VM
>>>>>>>> areas to its linear alias as well")
>>>>>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>>>>>> ---
>>>>>>>> Based on Linux 6.18-rc4.
>>>>>>>>
>>>>>>>>  arch/arm64/mm/pageattr.c | 5 ++++-
>>>>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>>>>>>> index 5135f2d66958..b4ea86cd3a71 100644
>>>>>>>> --- a/arch/arm64/mm/pageattr.c
>>>>>>>> +++ b/arch/arm64/mm/pageattr.c
>>>>>>>> @@ -148,6 +148,7 @@ static int change_memory_common(unsigned long addr, int numpages,
>>>>>>>>  	unsigned long size = PAGE_SIZE * numpages;
>>>>>>>>  	unsigned long end = start + size;
>>>>>>>>  	struct vm_struct *area;
>>>>>>>> +	int ret;
>>>>>>>>  	int i;
>>>>>>>>  
>>>>>>>>  	if (!PAGE_ALIGNED(addr)) {
>>>>>>>> @@ -185,8 +186,10 @@ static int change_memory_common(unsigned long addr, int numpages,
>>>>>>>>  	if (rodata_full && (pgprot_val(set_mask) == PTE_RDONLY ||
>>>>>>>>  			    pgprot_val(clear_mask) == PTE_RDONLY)) {
>>>>>>>>  		for (i = 0; i < area->nr_pages; i++) {
>>>>>>>> -			__change_memory_common((u64)page_address(area->pages[i]),
>>>>>>>> +			ret = __change_memory_common((u64)page_address(area->pages[i]),
>>>>>>>>  					       PAGE_SIZE, set_mask, clear_mask);
>>>>>>>> +			if (ret)
>>>>>>>> +				return ret;
>>>>>>> Hmm, this means we can return failure half-way through the
>>>>>>> operation. Is that something callers are expecting to handle? If
>>>>>>> so, how can they tell how far we got?
>>>>>> IIUC the callers don't have to know whether it is half-way or not
>>>>>> because the callers will change the permission back (e.g. to RW)
>>>>>> for the whole range when freeing memory.
>>>>> Yes, it is the caller's responsibility to set the
>>>>> VM_FLUSH_RESET_PERMS flag. Upon vfree(), it will change the direct
>>>>> map permissions back to RW.
>>>> Ok, but vfree() ends up using update_range_prot() to do that and if
>>>> we need to worry about that failing (as per your commit message),
>>>> then we're in trouble because the calls to set_area_direct_map()
>>>> are unchecked.
>>>>
>>>> In other words, this patch is either not necessary or it is
>>>> incomplete.
>>>
>>> Here is the relevant email, in the discussion between Ryan and Yang:
>>>
>>> https://lore.kernel.org/all/fe52a1d8-5211-4962-afc8-c3f9caf64119@os.ampereco...
>>>
>>> We had concluded that all callers of set_memory_ro() or
>>> set_memory_rox() (which require the linear map perm change back to
>>> default, upon vfree()) will call it for the entire region
>>> (vm_struct). So, when we do the set_direct_map_invalid_noflush, it
>>> is guaranteed that the region has already been split. So this call
>>> cannot fail.
>>>
>>> https://lore.kernel.org/all/f8898c87-8f49-4ef2-86ae-b60bcf67658c@os.ampereco...
>>>
>>> This email notes that there is some code doing set_memory_rw() and
>>> unnecessarily setting the VM_FLUSH_RESET_PERMS flag, but in that
>>> case we don't care about the set_direct_map_invalid_noflush call
>>> failing, because the protections are already RW.
>>>
>>> Although we had also observed that all of this is fragile and
>>> depends on the caller doing the correct thing. The real solution
>>> should be somehow getting rid of the BBM style invalidation. Ryan
>>> had proposed some methods in that email thread.
>>>
>>> One solution which I had thought of is this: observe that we are
>>> doing an overkill by setting the linear map to invalid and then
>>> default for the *entire* region. What we can do is iterate over the
>>> linear map alias of the vm_struct *area and only change permission
>>> back to RW for the pages which are *not* RW. And those relevant
>>> mappings are guaranteed to be split, because they were changed from
>>> RW to not RW.
>>
>> @Yang and Ryan,
>>
>> I saw Yang's patch here:
>> https://lore.kernel.org/all/20251023204428.477531-1-yang@os.amperecomputing....
>>
>> and realized that currently we are splitting away the linear map
>> alias of the *entire* region.
>>
>> Shouldn't this then imply that set_direct_map_invalid_noflush will
>> never fail, since even a set_memory_rox() call on a single page will
>> split the linear map for the entire region, and thus there is no
>> fragility here which we were discussing? I may be forgetting
>> something, this linear map stuff is confusing enough already.
>
> It still may fail due to page table allocation failure when doing the
> split. But it is still fine. We may run into 3 cases:
>
> 1. set_memory_rox succeeds in splitting the whole range, then
>    set_direct_map_invalid_noflush() will succeed too
> 2. set_memory_rox fails to split, for example, it changes only a
>    partial range's permission due to page table allocation failure;
>    then set_direct_map_invalid_noflush() may
>    a. successfully change the permission back to default up to where
>       set_memory_rox failed, since that range has been successfully
>       split. This is ok since the remaining range was not actually
>       changed to ro by set_memory_rox at all
>    b. successfully change the permission back to default for the
>       whole range (for example, memory pressure is mitigated by the
>       time set_direct_map_invalid_noflush() is called). That is
>       definitely fine as well
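For reference, the vfree() reset path under discussion, as a simplified
paraphrase of vm_reset_perms() in mm/vmalloc.c (not the verbatim kernel
code). Note that neither set_area_direct_map() call checks the return
value of its callback, which is the unchecked failure Will points out
above:

static void vm_reset_perms(struct vm_struct *area)
{
	/* invalidate the linear map alias of every page in the area */
	set_area_direct_map(area, set_direct_map_invalid_noflush);
	/* ... vm_unmap_aliases() / TLB flush elided ... */
	/* then restore the default (RW) permissions */
	set_area_direct_map(area, set_direct_map_default_noflush);
}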
Correct. What I mean to imply here is that your patch will break this: if set_memory_* is applied on x till y, your patch changes the linear map alias
only from x till y, whereas set_direct_map_invalid_noflush operates on 0 till size - 1, where 0 <= x <= y <= size - 1. So it may encounter -ENOMEM
on the [0, x) range while invalidating, and that is *not* okay, because we must reset [0, x) back to default?
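To make the sequence concrete, a hypothetical example (x and y are
arbitrary page indices chosen for illustration; this assumes the patched
behaviour where the alias update is narrowed to the requested range):

#include <linux/vmalloc.h>
#include <linux/set_memory.h>
#include <linux/mm.h>

static void partial_range_hazard(void)
{
	unsigned long x = 1, y = 2;	/* 0 <= x <= y < nr_pages */
	void *p = vmalloc(8 * PAGE_SIZE);

	if (!p)
		return;
	set_vm_flush_reset_perms(p);
	/* splits/updates the linear map alias for pages [x, y] only */
	set_memory_rox((unsigned long)p + x * PAGE_SIZE, y - x + 1);
	/*
	 * vfree() -> vm_reset_perms() -> set_area_direct_map() walks the
	 * whole area, so set_direct_map_invalid_noflush() may have to
	 * split [0, x) and can fail with -ENOMEM, unchecked.
	 */
	vfree(p);
}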
I see your point now. But I think the callers need to guarantee they call set_memory_rox and set_direct_map_invalid_noflush on the same range, right? Currently the kernel just calls them on the whole area.
Nope. The fact that the kernel changes protections, and undoes those changes, on the *entire* alias of the vm_struct region protects us from the fragility we were talking about earlier.
This is what I meant by "kernel just calls them on the whole area".
Suppose you have a range from 0 till size - 1, and you call set_memory_* on a random point (page) p. The argument we discussed above is independent of p, which lets us drop our
previous (erroneous) conclusion that all of this works only because no caller does a partial set_memory_*.
Sorry, I don't follow you. What "erroneous conclusion" do you mean? You can call set_memory_* on a random point, but then set_direct_map_invalid_noflush() should be called on that random point too. The current code of set_area_direct_map() doesn't consider this case because there is no such call. Is this what you meant?
I was referring to the discussion in the linear map work - I think we had concluded that we don't need to worry about the BBM style invalidation failing, *because* no one does a partial set_memory_*.
Yes, we don't have to worry about it.
What I am saying is: we don't care whether the caller does a partial or a full set_memory_*, we are still safe, because the linear map alias change on both sides (set_memory_* -> __change_memory_common, and vm_reset_perms -> set_area_direct_map())
operates on the entire region.
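Concretely, both sides walk area->nr_pages. The first loop is from the
patch quoted above; the second is a paraphrase of what
set_area_direct_map() effectively does:

	/* arm64 change_memory_common(), rodata_full case: whole area */
	for (i = 0; i < area->nr_pages; i++)
		__change_memory_common((u64)page_address(area->pages[i]),
				       PAGE_SIZE, set_mask, clear_mask);

	/* vfree() -> vm_reset_perms() -> set_area_direct_map(): whole area again */
	for (i = 0; i < area->nr_pages; i++)
		if (page_address(area->pages[i]))
			set_direct_map_invalid_noflush(area->pages[i]);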
Yes, this is the current behavior. My patch changes change_memory_common() to do the permission update only for the range the callers requested, instead of assuming it changes the entire region, although no one currently calls set_memory_* on a partial range. Should set_area_direct_map() be aware of a potential partial-range change from set_memory_*()? Maybe. But it is only called from vfree(), which frees the entire region.
What happens if someone does something crazy, for example, calls set_memory_* on a partial range and then calls vfree()? IIUC, it is fine as well. It is still covered by the 3 cases that I mentioned in the previous email, if I'm not missing anything, right?
Assuming the caller also does set_vm_flush_reset_perms(), the 3 cases will work out.
I would like to send a patch clearly documenting this behaviour, assuming no one else finds a hole in this reasoning.
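Something along these lines, as a rough first draft (wording entirely
mine, to sit near set_area_direct_map() and/or the arm64
permission-change code):

/*
 * set_memory_*() on arm64 and the reset done by vfree()
 * (vm_reset_perms() -> set_area_direct_map()) both operate on the
 * linear map alias of the *entire* vm_struct area. Hence, by the time
 * set_direct_map_invalid_noflush() runs, every range it touches was
 * either fully split by a successful set_memory_*() call (so the
 * invalidation cannot fail there), or the split failed earlier, in
 * which case that range never left its default RW permissions and a
 * failure to invalidate/reset it is harmless.
 */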
A proper comment to explain the subtle behavior is definitely welcome.
Thanks, Yang
>
> Hopefully I don't miss anything.
>
> Thanks,
> Yang
>
>>>> Will