On Fri, May 14, 2021 at 10:50:01AM +0100, Catalin Marinas wrote:
To ensure that instructions are observable in a new mapping, the arm64 set_pte_at() implementation cleans the D-cache and invalidates the I-cache to the PoU. As an optimisation, this is only done on executable mappings and the PG_dcache_clean page flag is set to avoid future cache maintenance on the same page.
When two different processes map the same page (e.g. private executable file or shared mapping) there's a potential race on checking and setting PG_dcache_clean via set_pte_at() -> __sync_icache_dcache(). While on the fault paths the page is locked (PG_locked), mprotect() does not take the page lock. The result is that one process may see the PG_dcache_clean flag set but the I/D cache maintenance not yet performed.
Avoid test_and_set_bit(PG_dcache_clean) in favour of separate test_bit() and set_bit(). In the rare event of a race, the cache maintenance is done twice.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Found while debating with Steven a similar race on PG_mte_tagged. For the latter we'll have to take a lock but hopefully in practice it will only happen when restoring from swap. Separate thread anyway.
There's at least arch/arm with a similar race. Powerpc seems to do it properly with separate test/set. Other architectures have a bigger problem as they do a similar check in update_mmu_cache(), called after the pte was already exposed to user.
I looked at fixing this in the mprotect() code but taking the page lock will slow it down, so not sure how popular this would be for such a rare race.
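To illustrate the ordering problem described in the commit message, here is a minimal user-space sketch using C11 atomics. It is only a model: struct fake_page, fake_cache_maintenance(), sync_racy(), sync_fixed() and flag_implies_maintained() are hypothetical names, the counter stands in for the real I/D cache maintenance, and nothing here models PoU semantics, barriers or the ISB.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a struct page (not kernel code). */
struct fake_page {
	atomic_bool dcache_clean;	/* models PG_dcache_clean */
	atomic_int maint_count;		/* counts simulated cache maintenance */
};

/* Stand-in for sync_icache_aliases(): only records that it ran. */
static void fake_cache_maintenance(struct fake_page *p)
{
	atomic_fetch_add(&p->maint_count, 1);
}

/*
 * Old pattern: the flag is published by the atomic exchange *before*
 * the maintenance runs, so a concurrent caller can observe the flag
 * set and skip maintenance that has not happened yet.
 */
static void sync_racy(struct fake_page *p)
{
	if (!atomic_exchange(&p->dcache_clean, true))
		fake_cache_maintenance(p);
}

/*
 * New pattern: test, do the maintenance, then set the flag. Two racing
 * callers may both do the maintenance, but the flag is never visible
 * before at least one of them has completed it.
 */
static void sync_fixed(struct fake_page *p)
{
	if (!atomic_load(&p->dcache_clean)) {
		fake_cache_maintenance(p);
		atomic_store(&p->dcache_clean, true);
	}
}

/* What a racing observer relies on: flag set implies maintenance done. */
static bool flag_implies_maintained(struct fake_page *p)
{
	if (atomic_load(&p->dcache_clean))
		return atomic_load(&p->maint_count) > 0;
	return true;
}
```

With sync_fixed(), flag_implies_maintained() holds at every point in time, at the cost of possibly incrementing maint_count twice under a race. With sync_racy(), there is a window right after the exchange where the flag is set and maint_count is still zero.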
 arch/arm64/mm/flush.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index ac485163a4a7..6d44c028d1c9 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -55,8 +55,10 @@ void __sync_icache_dcache(pte_t pte)
 {
 	struct page *page = pte_page(pte);
 
-	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+	if (!test_bit(PG_dcache_clean, &page->flags)) {
 		sync_icache_aliases(page_address(page), page_size(page));
+		set_bit(PG_dcache_clean, &page->flags);
+	}
Acked-by: Will Deacon <will@kernel.org>
I wondered about the ISB for a bit (we don't broadcast it), but it should be fine as the racing CPU needs to return to userspace.
Will