Hi Peng,
On Thu, Aug 31, 2023 at 10:45 AM Peng Zhang zhangpeng.00@bytedance.com wrote:
在 2023/8/31 16:25, Geert Uytterhoeven 写道:
On Thu, Aug 31, 2023 at 7:39 AM Michael Ellerman mpe@ellerman.id.au wrote:
Geert Uytterhoeven geert@linux-m68k.org writes:
On Fri, 18 Aug 2023, Liam R. Howlett wrote:
The current implementation of append may cause duplicate data and/or incorrect ranges to be returned to a reader during an update. Although this has not been reported or seen, disable the append write operation while the tree is in rcu mode out of an abundance of caution.
During the analysis of the mas_next_slot() the following was artificially created by separating the writer and reader code:
Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last slot write is to start of slot store current contents in slot overwrite old end pivot mas_next_slot(): read end metadata read old end pivot return with incorrect range store new value
Alternatively:
Writer: reader: mas_wr_append set end pivot updates end metata Detects write to last slot last lost write to end of slot store value mas_next_slot(): read end metadata read old end pivot read new end pivot return with incorrect range set old end pivot
There may be other accesses that are not safe since we are now updating both metadata and pointers, so disabling append if there could be rcu readers is the safest action.
Fixes: 54a611b60590 ("Maple Tree: add new data structure") Cc: stable@vger.kernel.org Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com
Thanks for your patch, which is now commit cfeb6ae8bcb96ccf ("maple_tree: disable mas_wr_append() when other readers are possible") in v6.5, and is being backported to stable.
On Renesas RZ/A1 and RZ/A2 (single-core Cortex-A9), this causes the following warning:
clocksource: timer@e803b000: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 28958491609 ns sched_clock: 32 bits at 66MHz, resolution 15ns, wraps every 32537631224ns /soc/timer@e803b000: used for clocksource /soc/timer@e803c000: used for clock events +------------[ cut here ]------------ +WARNING: CPU: 0 PID: 0 at init/main.c:992 start_kernel+0x2f0/0x480 +Interrupts were enabled early
...
I do not see this issue on any other platform (arm/arm64/risc-v/mips/sh/m68k), several of them use the same RCU configuration.
There's something similar on pmac32 / mac99.
Do you have a clue?
It seems something in the maple tree code is setting TIF_NEED_RESCHED, and that causes a subsequent call to cond_resched() to call schedule() and enable interrupts.
On pmac32 enabling CONFIG_DEBUG_ATOMIC_SLEEP fixes/hides the problem. But I don't see why.
Enabling CONFIG_DEBUG_ATOMIC_SLEEP on RZ/A1 and RZ/A2 does fix the problem. But there must be more to it, as some of my test configs had it enabled, and others hadn't, while only RZ/A showed the issue. I tried disabling it on R-Car M2-W (arm32) and R-Car H3 (arm64), and that did not cause the problem to happen...
I guess this patch triggers a potential problem with the maple tree. I don't have the environment to trigger the problem. Can you apply the following patch to see if the problem still exists? This can help locate the root cause. At least narrow down the scope of the code that has bug.
Thanks for your suggestion!
diff --git a/lib/maple_tree.c b/lib/maple_tree.c index f723024e1426..1b4b6f6e3095 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -4351,9 +4351,6 @@ static inline void mas_wr_modify(struct ma_wr_state *wr_mas) if (new_end == wr_mas->node_end && mas_wr_slot_store(wr_mas)) return;
if (mas_wr_node_store(wr_mas, new_end))
return;
if (mas_is_err(mas)) return;
Unfortunately removing these lines does not help.
Gr{oetje,eeting}s,
Geert